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EDITORIAL 



The Practice of Forth 



Forth always has been a language whose success was rooted not in theory but in prac- 
tice. Despite a general lack of corporate or university sponsorship — with apologies to those 
companies and institutions of higher learning in which Forth has indeed been championed 
over the years — it has been in the trenches that Forth has proven its efficacy, efficiency, 
and vitality. A few publications have objectively documented Forth's strengths but most, 
relying by necessity on advertising dollars and appeal to mass interests in order to address 
their understandably bottom-line concerns, largely have ignored it. This is not to say that 
Forth is an unpublished language; the Bibliography of Forth References which was maintained 
for a number of years by The Institute for Forth Application and Research, documented a 
surprising depth and breadth of coverage, both academic and popular, of this language. 
(The Bibliography, when last I saw it, was sadly out of date; if updated, it probably would 
double its already impressive size.) 

Despite the fondest wishes of many. Forth has never achieved mass appeal. Instead, it 
has suffered the fate of the long-distance runner, whose success lies in crossing the finish 
line, not in besting the pack. 

But Forth mostly is a tool for toolbuilders and problem solvers, not the mass market. Its 
adaptability and flexibility have been of most value in situations calling for outstanding 
performance under unusual constraints. Fast development needed? Skillful Forth program- 
mers regularly deliver full-featured programs in the time required by skilled users of other 
languages to deliver an initial prototype. Few resources available? Forth's model allows a 
degree of application functionality that can only be viewed as incredible in hardware that 
barely accommodates the run-time kernel of other languages. 

Of course, true to its historical trend, this is swimming upstream. General practice these 
days — at least the tales that make news and drive up costs for consumers and small enter- 
prises — is to throw more-expensive hardware at a problem, to deploy larger programming 
teams, to design solutions that ultimately will require expensive maintenance and admin- 
istrative personnel until a bigger, costlier solution relegates the old one to the scrap heap. 

But in the trenches, the troops carry on. Alone or in teams, proficient Forth programmers 
continue the daily work of finding appropriate niches, and of delivering good work on time. 
Forth's greatest asset is the integrity and diligence of its users who appreciate the benefits 
inherent in, or which can be coaxed from, what the mainstream might view as limitations. 

Since its inception. Forth also has benefitted from the efforts of an even smaller minor- 
ity of adherents, a few people whose public contributions have been not so much the pro- 
grams they write or features they introduce to the language, but their ability to help this 
dispersed community of independent-minded users to cohere and communicate and coop- 
erate in ways that benefit everyone. The loss of one of those people, as happened last spring, 
reminds us to be very grateful for each person who takes the time and thought necessary to 
share their experience, knowledge, and even wisdom, with the rest of us. 



In Memoriam 

With great regret, we must report that Robert Reiling passeS away on Wednesday, May 5 
of this year. 

In the Forth community, Mr. Reiling was the director of the annual FORML Conference, 
and was a past President of the Forth Interest Group. His diplomacy and professional de- 
meanor, as well as his personal commitment and friendliness, could always be relied upon, 
and he will be missed. His dedication and encouragement also extended to groups that in- 
cluded the seminal Homebrew Computer Club and local ham radio enthusiasts. 

Bob had contracted cancer, and responded to treatment favorably enough to direct the 
20th FORML Conference last November and, shortly thereafter, to resume his full-time 
work until the illness recurred. 

We extend our condolences to Bob's friends and family and, like many others, are very 
grateful for his contributions and support. 
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High Accuracy Table Lookup 
Using Cubic Interpolation 



This algorithm basically trades speed for table size by as- 
suming that the line joining points in a lookup table is really 
a curve. The value in question rests on the curve between the 
two middle points of a four-point segment. The curve is as- 
sumed to be a third-degree polynomial that passes through 
all four points. 

Intended for use on small processors, this code uses only 
integer arithmetic. I originally wrote it to calculate various 
transcendental functions to 16-bit precision. There are more 
efficient ways to approximate such functions, but the gen- 
eral-purpose method presented here lends itself to arbitrary 
functions, too. 

The theory behind the algorithm is as follows: 

Given points yO, yl, y2, and y3, there is a point f(x) be- 
tween yl and y2 where the region of interest is < x < 1. 
f(x) = wO + wl * X + w2 * x^2 + w3 * x^3 

For four equally spaced points (n = -1,0,1,2), f(n) gives 
four equations: 

f(-l) = yO = wO - wl + w2 - w3 
f(0) = yl = wO 

f{l) = y2 = wO + wl + w2 + w3 
f(2) = y3 = wO + 2wl + 4w2 + 8w3 



Simultaneously solving these equations yields the follow- 
ing coefficients upon which the algorithm is based: 

wO = yl 

wl = (-2yO - 3yl + 6y2 - y3) / 6 

w2 = ( 3y0 - 6yl + 3y2 ) / 6 

w3 = ( -yO + 3yl - 3y2 + y3) / 6 

The word cubic 4 does the approximation using four data 
points at an address. CUBIC does some indexing and scaling 
in order to be useful in using a lookup table. 

The algorithm takes some shortcuts to keep the math 
simple, so a wildly varying lookup table could cause an over- 
flow. In typical applications, you won't come close to this 
situation,, but it always pays to test. 

The example given here represents the first quadrant of a 
sine function using 19 data points. This gives better than 16- 
bit precision. An 80 point table gives a maximum error of 
about .004 PPM. 



\ Table Lookup Using Cubic Interpolation 



d2* 
d4* 
d6* 



2dup d+ 
d2* d2* 
d2* d3* 



d3* 
d5* 
dl6* 



2dup d2* d+ ; 
2dup d4* d+ ; 
d4* d4* ; 



8 cells 
bytes 



constant cellbits \ bits/cell assuming byte addressing 

\ change if your address units aren't 



1 cellbits 1- Ishift 2constant wround \ i.e. 0x00008000 for 16-bit 
Forth 



variable wptr \ points to the input data 

■i 

: @seq ( — d ) 

\ get next point for coefficients ( write in assembly for speed ) 
wptr 8 @ s>d 

[ 1 cells ] literal wptr +! ; 



Brad Eckert • Mesa, Arizona 
bradbits@hotmail.com 
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wl 



( a — n ) 
wptr ! 0. 
@seq d2* d- 
@seq d6* d+ 



\ e 

@seq d3* d- 
@seq d- 



wl 



drop 



w2 



( a — n ) 
wptr 1 . 
@seq d3* d+ 
@seq d3* d+ 



\ 6 



@seq d6* d- 



w2 



drop 



w3 (a — n ) \ 6 * w3 

wptr ! 0. 

@seq d- @seq d3* d+ 

@seq d3* d- @seq d+ drop 



cterm ( frac nl n2 — nS ) \ n3 = nl * frac + n2 

>r m* d2* wround d+ nip \ trunc --> round 
r> + ; 



: cubic4 ( frac a — n ) \ frac = 0..maxint 

\ perform cubic interpolation on 4-cell table at a 

>r dup dup r@ w3 \ w3 

r@ w2 cterm \ w3* f + w2 

r@ wl cterm 6 / \ (w3*n*n + w2* n + wl) / 6 

r> cell+ @ cterm ; \ *n + yl 



: tcubic ( nl addr — n2 ) 

\ perform cubic interpolation on table at addr 

\ nl = 0. .2''cellsize-l 

dup cell+ >r @ ( nl tablesize | addr ) 

um* >r 1 rshift r> ( frac offset | addr ) 
cells r> + cubic4 ; 



: CUBIC ( nl span addr -- n2 ) 

\ perform cubic interpolation on table at addr, nl = 0..span-l 
>r >r swap r> um/mod nip 
r> tcubic ; 



create exampletable \ Sine table (1st quadrant) 

16 , C 16 points plus 3 endpoints ) 

-3212 , , 3212 , 6393 , 9512 , 12540 , 15447 , 18205 , 

20788 , 23170 , 25330 , 27246 , 28899 , 30274 , 31357 , 32138 , 
32610 , 32767 , 32610 , \ clipped to maxint for 16-bit 4ths 

.( 32768* sin (lOdegrees) is ) 10 90 ExampleTable CUBIC . 
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User Stacks in ANS Forth 



Forth programmers are, of course, familiar with the con- 
cept of the information stack, since the data stack and return 
stack are at the heart of Forth. Here I would like to remind 
readers of the concept of a stack as an abstract data type. In this 
view, a stack is defined in terms of the things you can do with 
it, regardless of the implementation details that make those 
things possible. In this view, a stack is characterized as follows: 

• You can put things on a stack. 

• You can take things off a stack. 

• The thing taken off is always the last thing put on. 

Here we present words to create and manipulate stacks 
implemented as a linked list. 

Figure One illustrates the principle of the linked list. The 
rectangles represent nodes — a number of contiguous memory 
locations. These blocks of memory do not have to be next to 
each other, nor must they all be of the same size, nor do they 
have to be in order (although any of these conditions may be 
imposed by an implementor in the name of performance ef- 
ficiency, depending on the application). 

The key idea is the existence of a link field (shown in Fig- 
ure One at the left end of each node) that points from one 
node to the next. There is a separate pointer to the head of the 
list, and the pointer of the last node is a null pointer, pointing 
to nothing. In Forth, it is convenient to use zero as a null 
pointer, since it is easy to test for and there are few systems 
that would allow memory location zero to be the valid start- 
ing address of a link node. Variations on this theme include 
having pointers to other locations on the list, circular lists 
(where the last item points to the first item) and doubly linked 
lists (with pointers going in both directions). 

Linked lists are important in the computer world because: 

• they can be traversed almost as rapidly as accessing 
contiguous memory locations, 

• items can be added or removed "on the fly," therefore, 

• they use memory efficiently. 

We now have a pretty good problem specification. We need 
Forth words to: 

• create a user stack 

• push items onto the user stack from the Forth data stack 

• pop items off the user stack onto the Forth data stack 

It would also be handy if, following a pop, performing a 
push restored the items to the user stack in the same order 
they had been (making push and pop reciprocal operations). 

The accompanying code shows one way to do this. 

We have the defining word... 

: stack CREATE , ; 



Usage: stack mystack (creates a stack named mystack); 
then mystack puts the address of the pointer to the top of 
the user stack on the Forth data stack. 

CREATE lays down the necessary header information for a 
new word in the Forth dictionary (itself often a rather com- 
plicated linked list or lists). , gives the word a cell of data 
space and initializes it to the null pointer (since the stack is 
empty when created). When mystack is executed, its action 
will be to put the address of its cell of data space on the Forth 
data stack. Since that is all we need or want, there is no need 
for further action by a does> in this simple defining word. 

Figure One 

pointer 




Now for push, which will create and populate a new node. 
We need a link field, which we will put first. This is a handy 
position, since the address of its cell will be the first node 
information available, and this way we can get at everything 
else with simple positive offsets. Since we want to be able to 
use variable-size nodes, the next cell will contain the node 
size, necessary overhead for this capability. The third cell will 
be the first of the cells containing the data of the node. 

The size specification could be either the number of ac- 
tual data cells or the actual node size, both data and over- 
head. My personal preference, implemented here, is to use 
the total node size. This means the programmer needs to re- 
member to bump the size specification to the number of ac- 
tual data cells plus two. Push will take items off the Forth 
data stack one by one and store them in the node in order, so 



Leonard Zettel • Dearborn, Michigan 
Zettei@acm.org 
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\ This is an ANS Forth Program requiring the Memory-Allocation word set 

\ Words to handle a user-created stack as a linked list with nodes of arbitrary size. 



: stack CREATE , ; 



: node_size ( node-addr -- node-size) CELL+ @ ; 

: n! ( nl . . nn addr n --) \ Store nl to nn in consecutive cells 

\ starting at addr. 
CELLS OVER + SWAP DO I ! 1 CELLS +LOOP ; 

: n@ ( addr n -- nl . . nn) \ Fetch n consecutive values starting at 

\ addr + (wordsize) * (n-1 ) & leave them 
\ on the stack. 
1- CELLS OVER + DO I @ -1 CELLS +LOOP ; 



: node. ( addr — ) \ Display the contents of the node at addr. 

DUP @ U. DUP CELL+ @ CELLS OVER + SWAP CELL+ DO I @ . 1 CELLS +LOOP ; 



: list. ( ptr — ) \ Display the contents of the stack pointed to by ptr. 
CR DUP @ 0= IF ." stack empty" DROP EXIT THEN 
CR BEGIN @ ?DUP WHILE DUP node. CR REPEAT ; 

\ Thanks to Marcel Hendrix for noting that ALLOCATE works in address units. 
: push ( nn . . nl addr --) \ Push nl . . nn onto the stack pointed to 

\ by addr. nn is the node size in cells 

OVER >R R@ ( get_node) 

CELLS ALLOCATE \ Get node space 

ABORT" push: ALLOCATE failed." 

>R DUP @ \ Get address of node at the top of the node stack 

R@ ROT ! \ Make new node top of stack 

R> R> n! ; \ Store node contents. 



pop ( addr -- nn . . nl) \ Pop stack pointed to by addr, leaving 

\ node values on the stack and freeing 

\ the node space. 

DUP @ DUP 0= ABORT" Empty user stack." 

DUP @ ROT ! DUP >R 

CELL+ DUP @ 1- n@ R> 

FREE ABORT" pop: FREE failed" ; 



the item deepest on the Forth data stack will be in the node's 
end position. 

Looking at the code for push, OVER >R R@ parks a copy of 
the node size on the return stack. CELLS ALLOCATE gets a' 
node-sized chunk of memory, ABORTing with an error mes- 
sage if for some reason it could not do so. >R parks the ad- 
dress of the new node on the return stack. DUP makes a copy 
of the address of the pointer to top-of-user-stack. @ puts the 
address of the cunent top-of-user-stack on the Forth data stack. 
R@ puts the address of the new node on the Forth data stack, 
ROT puts the address of the pointer- to-top-of-user-stack on 
top of it, and ! stores the new node address in pointer-to- 
top-of-user-stack. R> puts the address of the new node on the 
Forth data stack. The next R> puts the node size on top of it, 
and our word n ! populates the node. 



Since we mentioned it, let's take a look at n ! (n-store). We 
need to store n items of information, each the size of a cell, 
in consecutive memory cells starting at addr. The obvious way 
to do this is with a DO ... LOOP. So let's see... we could set up 
the following: 

DO SWAP OVER I CELLS + ! LOOP DROP ; 

This would do it. DO sets up the loop parameters, swap 
puts the next item to store on top of the stack, over puts a 
copy of the base address over it. I CELLS + gives the address 
the proper offset. ! stores the item. At the end of the loop, 
we are left with the base address on the stack, so we drop it. 
Not too bad. 

Can we do better? Suppose we could arrange it so that i 
furnished the storage address itself instead of a count. Then 
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the business contents of the loop could simply be i ! , but i 
would have to increase by the number of address units in a 
cell. We could do that with 1 CELLS +L00 P. Okay, so far we 
have DO I ! 1 CELLS +L00 P. Then we notice we no longer 
need the dro p. All that's left is figuring out how to set up the 
proper DO range. The first value of i has to be addr. The last 
value of I used will be addr plus (n-l)*(address units in a cell). 
Given the rules governing DO loops, this means an upper limit 
of addr plus (K)*(address units in a cell), because with an in- 
creasing index, the iteration stops one pass short of the loop 
limit. We can get the required loop parameters with cells 
OVER + SWAP. CELLS switches us from number of cells to 
number of address units. OVER + gets the required upper limit. 
SWAP puts things in order for the following DO. At the cost of 
some preliminary setup work, we have reduced the number 
of words inside the loop (where most of the work will be done) 
from seven words to four, a fair savings. 

Let's look at pop, the inverse operation of push. First we 
check that there is indeed something on the stack to pop, 
ABORT"ing if there isn't. Assuming we pass that test, the stack 
picture is now (addrl addr2), with addrl being the address of 
the pointer-to-top-of-stack, and addr2 the address of the first 
cell of the top of stack, dup @ puts the pointer to the next item 
down (if any) on top of the data stack, rot ! makes the pointer- 
to-top-of stack point to that item, since that will be the new 
top of stack after the pop completes, dup >r parks a copy of 
the address of the item to be popped on the return stack. CELL+ 
bumps the address to the cell containing the size of the node. 
dup @ 1- gives the parameters needed by n@, which puts the 
required information on the data stack. Finally, R> FREE gives 
the space occupied by the popped node back to the system, 
since the application no longer needs it. Doing this here means 
we don't have to worry further about garbage collection, which 
can be a headache. We'll let the system take care of that, since 
it should be more competent to do so. 

n@ follows the pattern of n! with some adjustments for 
circumstances. Here the index has to start at the high address 
and count down, thus the -1 CELLS + LOO P. The first address 
fetched will be at addr + («-l)*(cell size in address units), so we 
have 1- CELLS OVER +. Because this loop will be counting 
down, the final value of I in the loop will be the limit value, 
which we set to addr. So we see that what at first seems to be 
a Forth idiosyncrasy turns out to be nicely suited to the uses 



of zero-based addressing, where n items are indexed as «(0), 
m(1).. M(n-l) rather than h(1)... m(w). 

At this point we have covered how to create a user stack, 
and how to push items on to it or pop them off. Another 
handy thing to do (perhaps while debugging an application) 
is list the contents of a stack. For this we have list, ("list- 
dot"). Given a pointer to a list, we look at the next pointer 
and, while it is non-null, we display the node contents with 
node, (which follows the same principles as n!) and then go 
on to the next item. node, has some complications that come 
from dealing with messy reahties. Addresses in Forth can cover 
the full range of unsigned numbers, so the first cell is dis- 
played using u . while the remaining values are diplayed with 
. (dot). This leads to some complications in setting up the 
DO parameters. We can still get the upper limit with DOP CELL+ 
@ CELLS OVER +, but since we have already displayed the 
contents of the first cell, we increment the DO starting value 
using CELL+, 

Now that we have reviewed everything, let's try a simple 
example: 

Stack, mystack 

...will create an empty user stack named mystack. 
Mystack list 

...will produce the message "stack empty," since we haven't 
put an)^hing in it yet. So let's follow up with: 

567 3 mystack push mystack list. 
We should now see 3 5 6 7 . Now let's try: 

mystack 

pop mystack push 

1009 885 234 5 mystack push 

mystack list. 

(On as many lines as you like, with as many uses of .S as you 
prefer). Using SwiftForth™ from FORTH, Inc. I saw: 

22282240 5 234 885 1009 

3 567 

ok 

...which is what I should have seen. 



Forth-Geseilschaft eV (Germany's FIG) has changed the name and address of its web site. 
The new URL is: 

http://www.forth-ev.cle 

For the benefit of those who do not read Gerrtian, at press time, a translation of the whole site into 
English was in preparation. 

The site's webmaster is Dr. Egmont Woitzel, member of the Board of Directors of Forth-Gesellschaft. 

Up to now, the work put into the new site is entirely due to Dr. Egmont Woitzel and Professor Dr. 
Thomas Beierlein,both from the Directorial Board of Forth-Gesellschaft. Much additional work comes 
from Friederich Prinz, Editor of Vierte Dimension and Member of the Directorial Board. 
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Aspects of a Particular 
Three-Stack Machine Design 



Abstract 

H3sm ("Hohensee's 3-stack machine") is a demo imple- 
mentation of a virtual computing machine with three dis- 
tinctly featured stacks, plus a Size register controlling the data 
stack. The stacks are the Return Stack, the Pointer Stack, and 
the Data Stack. The Data Stack, the Size register and affiliated 
ALU and stack operators implement a fundamental type called 
a pyte, which is an integer at the current value of Size. Size 
varies from one to 256 eight-bit bytes. Pointers and return 
addresses and their respective stacks are address-bus cells, as 
usual. H3sm currently has only a vestigial interpreter and no 
interpretive threading (compiler) capability. The current H3sm 
does demonstrate pyte arithmetic. 

GNU C source code for H3sm is at http://linux01.gwdg.de/ 
~rhohen/H3sm.html, and is heavily commented. 

H3sm and this essay are primarily the work of Rick (Rich- 
ard Allen) Hohensee, with distinct improvements by Michael 
Somos (http://grail.cba.csuohio.edu/-somos/). Amongst other 
things, Somos generalized the code for either-endian hosts, 
which I did not intend to address myself. 

Impetus 

The idea of a three-stack "Forth" has been gnawing at me 
for several years. Around 1992, 1 attempted and failed to write 
a three-stacker on the Commodore 64. At the time, I thought 
a doubly linked dictionary was a good idea, and I ran out of 
steam trying to implement that in 6502 machine language. 
Jonah Thomas (JET) has since pointed out that I could do the 
things 1 wanted without double linking. The H3sm dictio- 
nary linking is fairly conventional in this regard, so, in true 
Forth style, JET must be credited for something that, thank- 
fully, isn't in H3sm. 

Several things about a conventional Forth bug me or just 
seem curious. The absence of microprocessor-style conditional 
flags, the plethora of size-typed operators, and the fact that 1 
can never, to this moment, remember the order of operands to 
Forth 1 ("store"). 1 have hoped that three stacks can make a 
useful distinction between data and pointers, which will solve 
my little ! problem, and will also provide some reduction in 
the namespace-explosion that is one of Perth's weak points. 

Also curious is what 1 see as the missing coda to Phil 
Koopman's Stack Machines: The New Wave. This book describes 
the history of computing engines in terms of the number of 
stacks they have. Koopman points out that stacks like the 
typical CPU return stack and the Forth parameter stack are 
implicit to the instruction sets of their respective machines, 
and are not addressed, as registers are on machines with 
multiple similar registers. Koopman shows that computers 
have improved noticeably as they went from zero, to one, 
and then to two stacks. However, I don't recall much conjee- 
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ture in the book on more than two stacks, or any compelling 
case for two being the absolute upper limit. 

H3sm therefore begs to beg the question Koopman begs. 
Well then, many have said, why not 1024 stacks, or what- 
ever? Because, if they're all the same, you wind up with waste- 
ful addressing bits in the opcodes again. The key lies in the 
fact that with a small number like three, each "stack" can 
have properties distinct from the others. With two, you don't 
have much flexibility. With three, data items can be differ- 
ent in size than addresses. Variably sized, in fact. (Koopman's 
book is on the web in its entirety, by the way.) 

Looking at machine design very subjectively, a Forth is a 
nice little assortment of data structures/mechanisms. Forth's 
openness and simplicity allows re-use of its component parts. 
H3sm adds a couple of distinct parts to the toolkit. An H3sm 
models a machine with an address bus of typical size, and 
may help abstract the size of the data bus over a wide range 
of possible sizes. 

The name pyte originally was from "precision byte," and 
Size originally was called "Precision." My technical back- 
ground is in the field operations of land surveying, where 
one develops a mindset in which numbers are duals, with a 
unit and a precision. I've wanted a computer that can change 
itself from a low-precision implement to a high-precision 
implement — such as from a surveyor's manual "Chinese 
Ninety" or an artist's outstretched thumb, to a first-order tri- 
angulation theodolite — with the change of one variable. 

Design 

Each of the three H3sm "stacks" has a behavior that is 
distinctly different from the other two. 

return 

The return stack is rather typical, containing address-size ex- 
ecution tokens. One day, we might do loop indices and such 
on the return stack, too. 

pointer 

The H3sm pointer stack is address-cell sized. The pointer stack 
is "sluggish"; it is not auto-pop/push. The pointer stack 
pointer is usually left pointing to the recently referenced cell. 

data 

The data stack operates on pytes, groups of 1, 2, 4, 8, 16. ..256 
bytes. Boolean flags are the low byte of a pyte. False is zero. 
Non-zero is true. The H3sm true word asserts 255 in a flagpyte. 

Size "register" 

The current effective size of data stack operands is the Size 
state variable. There are user-visible accessors of this Size "reg- 
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ister." Operations on pytes are in terms of Size, except where 
a pyte is treated as a flag, aka a flagpyte. 

So the three stacks are the data typing of H3sm; typing is 
enforced by the various operands. The data stack is where a 
datum can be treated arbitrarily. There are a few ops to move 
things from stack to stack, with some conversion and data 
loss in some cases, as may be necessary between pytes and 
addresses, and to and from Size. 

The above data structures are defined by their interaction 
with the H3sm primitives in Table One [see following page]. 
(I kinda like the term atoms in lieu of the usual primitives, by 
the way.) It is messy, but not huge. I count 97 words. These 
atoms were more than sufficient to write the simple inter- 



preter. The interpreter is about 20 non-atomic words, written 
as (C-compiled-in) threads of the atoms. Glaring omissions 
include -,*,*/, and move. Available flow-control is rudi- 
mentary. Conversely, there's about a dozen scaffolding con- 
stants and so on that could easily be done without. Note that, 
in exchange for things like p+s and so on, we don't have any 
of the likes of 2+, 2dup, et al. 

The functionality of the above atoms may be more than 
you think at first glance. The math and logic that does exist 
works at any size from 1 to 256 bj^es. Fairly rich pointer- 
twiddling is also available. 1 would describe this as "thicker" 
than a Forth. A quick session vnth some pyte arithmetic may 
illustrate some of this thickness, r is the register picture word. 
Numbers are hex. 



Listine One. Sample session with pyte arithmetic. 

(the next 2 lines are my florid shell prompt, with input of "H3sm") 

$ cLIeNUXO /dev/tty3 r 00:30:15 /mount/bl/H3sm 

$H3sm 

total Virtual Address Space including dictionary is 65536 bytes, 
actual address of VAS is 0xbffe5d2c 

gcc-compiled at 22:37:38 on Dec 28 1998 



latest 
r 



bffe8674 



RETURN 
a34 
a50 





rsl= 



POINTER 







psl= 



(this is ourHSsm input line, r ) 

DATA pyte Size = 4 

msB, lower bytes > 

00 00 00 00 T.O.D.S. 
00 00 00 02 
00 00 00 00 
00 00 00 00 
00 00 00 00 

dsl = = IsB of TOS ip = 2520 



0-TAY! 
44444444 



66666666 10101010 r 



RETURN 
a34 
a50 





rsl= 



POINTER 







psl= 



DATA 
msB, 



(more input, 3 #'s and another r) 

pyte Size = 4 
lower bytes > 



10 
66 
44 
00 
00 



10 
66 
44 
00 
00 



dsl = 



10 
66 
44 
00 
00 
12 



10 
66 
44 
00 
02 



T.O.D.S, 



IsB of TOS ip = 2520 



0-TAY! 

2222 + r (etc.) 

RETURN POINTER DATA pyte Size = 4 

a34 msB, lower bytes > 

a50 10 10 32 32 T.O.D.S. 

66 66 66 66 

44 44 44 44 

00 00 00 00 

00 00 00 02 
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Table One. 






(P; begins a pointer stack comment. 




(R; is a Return Stack comment. 




( is a Data Stack comment. 






1 1 1 means "below (left of) here is required but not changed." 




HNC is Head Name Cell of a dictionary word. 




Atom name 


Stacks effects 


Size effects, comments 


address 


(P; - ptr ) 




AND 


( pytea pyteb — pyteaandb ) 




bytemask 


( — Oxff ) 




dualmask 


( — Oxffff ) 


might be handy for Unicode 






1 !SIZE 


call 


(R; - xt ) 




cells 




4 !SIZE 


aint 


( flagp — Iflagp ) 


the NOT of a flag 


bump 


(— junk) 


bye 




return to caller of H3sm 






with flag byte of TOS 


charsize 




1 ! SIZE (a cheat) 


doHNC 


(P; HNIC — ) (R; — RETInull ) 


Forth EXECUTE 


downsize 




shift Size down, or to one ! SI ZE 


drop 


( pyte — ) 




dup 


( pytea III — pytea ) 




ell 




unconditional branch 


!P 


(P; p store III ) 


store a ptr 


extend 




emit 


(pyte — ) 


treated as a char 


false 


( - Oflag ) 




@ 


(P; ptrlll-)(- pyte) 




@size 


(P; noun — noun ) 


! SIZE 


flag 


( pyte — flagpyte ) 


bytewise OR a pyte into its low byte 


four 


( - 4 ) 


pj^e constant 


gap 


( — ptra-b ) (P; ptra ptrb III — ) 


hostf n 


( — sh.ret.val ) (P; epa bpa III ) 




IBUFFERO 






max 


( a b — maxab ) 




NOT 


( pyte — Ipyte ) 




negate 


( a — 2's_complement_negative_a ) 




last 


(P; — count.byte.addr ) 




literal 


( — pyte ) 


ISIZE 


-P 


( pyte — ) (P; ptr — ptr-intpartofpyte ) 




no 


( flagpyte — ) 


conditional branch if true 


nothing 




NOP 


nown 


(P; — nown body ) 




ones 


( - -1 ) 


or fffff 


one 


( - 1 ) 


pyte constant 


OR 


( pytea pyteb — pyteaORb ) 




over 


( a b — aba) 




pdrop 


(P; ptr - ) 


deer pointer stack lubber 


pdup 


(P; ptra — ptra ptra ) 


period 


( — 46 ) 


ASCII . pyte constant 


p@ 


(P;ptrl — ptr2) 


ptrl overwritten 


+ 


( a b — c ) 


+P 


( pyte — ) (P; ptr — ptr+bytepartpyte ) 




(Table continues on next page.) 
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Atom n^iTif^ 




3izc errecis, coinincnis 


p-s 


(P- rifr r»tr-<ii7*» ^ 

\r , pil — pir-olic ) 




p+s 


\i f pir — ptr+oizc ) 




P+}d 


\i , pir — pir+i ) 




p+c 


\l , pLI pLiT*T ) 




p-c 


^^1 , pu — pir+*i ) 




p-c 


\L , pti — pir+*t ) 




pTO s 


CP* Si7e III ^ 


1 Si ze 


sTOp 


/p. Ci7<a \ 

V,-! , — olZc j 




P ■ 


\l , MUlC p 111 — - SIUIC p ) 




pswap 






p>r 


(i-*, ptr III ) (R, — ptr ) 




nTlP 


\i , — oiupir ) 




pus h 


(V- in ^ 

ip ) 






a U Hagpyic ) 




rdrop 


(K, a — ) 






) 




X. poup y 


/p. a III \ /p. a \ 


clup r to p 


r-^p 


^K, pir — ; \v , — pir ) 




r> s 


(R; size — ) 


! SIZE 


ssveDi ct lonsry 






s icfn 










pyte constant, aeciniai lo 


s i 2 Gd 


\i , — pir ; 






( -— ^17** ^ 




g> IT 


\i^f JliC ^ 




spa.ce 


V — j-ipyte ) 


pyte constant for a space 


1 


W , ptr III ) \ pyte — j 






V pyica pyicu — pytcD pytca _j 






C mi 


pyte constant, decimal 10 


t niree 


C 11 


pyte constant 


t ime 


( — iitimf* inf" 1 


/ 1 C T 7 C 




^1 , riiNLy — v_,oae_oOay_v^eii ) 




T01a.s t 


\i , pir — ) 


update latest/last 


Tn 1 1 n V 

i\j ± ±iii\ 


(P- T-TlSir' I inb Poll 1 




> s 


( size " ) 


1 C T "7 T? 


t irue 


( — true_flagpyte ) 




two 


( — 2 ) pyte constant 




ushif t 


( shiftee amount — shifted ) 




upsize 




! SIZE 


V doUd o fc! 


\i f — autir.or.vas.x[uj ) 


implementation requirement 




fP* V»r*a 111 \ 

\i f upa III — cpd ) 


blocks flow 


ft c 1 

U b i- 


\ wiidi ever . . . — ) 




Opsl 


(P; what ever ... — ) 




Of si 


CP* TArViat" f>\7Pr — ^ 
\i\t Wlldl cvci ... — f 




XOR 


( pytea pyteb — pyteaXORb ) 




yes 


( flagpyte — ) 


conditional branch if false. 


zero 


( - ) 


as a pyte constant 


ok 






r 




machine language-monitor-style stack pic 


tdump 


(P; text III — ) 
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rsl= 2 



psl= 



dsl = 12 = IsB of TOS ip = 2520 



0-TAY! 
2 TO size + 
RETURN 
a34 
a50 





rsl= 



POINTER DATA pyte Size = 2 

msB, lower bytes > 

42 42 T.O.D.S. 

66 66 

66 66 

44 44 

44 44 
psl= dsl = 12 = IsB of TOS 



iP 



2520 



0-TAY! 

8 TOsize dup 
RETURN 
a34 
a50 





rsl= 2 



POINTER DATA pyte Size = 8 

msB, lower bytes > 

42 42 66 66 66 66 44 44 T.O.D.S. 

42 42 66 66 66 66 44 44 

44 44 00 00 00 00 00 00 

00 02 00 00 00 00 00 00 

00 00 00 00 00 00 00 00 
psl= dsl = 14 = IsB of TOS ip = 2520 



0-TAY! 
bye 

$ cLIeNUXO /dev/tty3 r 00:36:44 /mount /bl /H3sm 
$ 



In the above, we did + at two and four bytes, and dup at 
eight bytes, to the pyte. This is operator vectoring, not over- 
loading. There is no interpreting involved. I'm told that big- 
ger adds are slow in silicon, without lots of extra silicon, but 
wide Booleans could be a big win in a relatively small amount 
of silicon if you have a use for them. More important may be 
the semantic freedom to design an algorithm for pytes, and 
to use it for whatever size data is appropriate at any particu- 
lar moment. The H3sm interpreter is very nearly Unicode- 
transparent for this reason, although there are one or two 
charsize assumptions in the current code. 

Implementation 

For those who don't care to browse the source, H3sm is a 
rather nasty piece of C. H3sm is distinctly not what C likes to 
do. My interest in C in this context is simply as a portable 
assembler, and the code reflects that intent. As little as pos- 
sible of C's sophistication is used. All of H3sm is in a single C 
function, mainO- H3sm uses GNU C labels-as-values, in pure 
mimicry of Gforth. This is a GNU extension to C that is a foriti 
of computed GOTO, and is in H3sm's NEXT macro; i.e., it is 
essential to H3sm's threading scheme. I suspect H3sm's thread- 
ing scheme, which I call Virtual Machine Subroutine Thread- 
ing, has a unique aspect. It is similar to what has been called 
"call threading" (by Ertl or Paysan, I think, in comp.lang.forth). 
However, H3sm has no w, no "Working Register." An atomic bit 
is necessary in the headers of atoms (primitives) to distinguish 
them from threads. The resultant threading behavior is slightly 
less confusing to me, resembling normal subroutine calls a bit 
more than most other schemes. 



One possible benefit of this cretinous C style is simple 
embeddability. It is trivial to rename mainQ and include 
H3sm in something else. This doesn't give any communica- 
tion between H3sm and the host code, however. A bootable 
version of H3sm should not be terribly difficult, either, and 
perhaps would be more interesting than an embedded one. 

The interpreter and pre-threader compiled-in threads in 
H3sm are very wasteful code, both in terms of code and 
memory used by the executable, which, at about 90K, is too 
big for such a simple program. Mercifully, that stuff only runs 
at startup. There was value in doing them the way I did, 
though, because the cpp macros served as a preview of the 
language and of what it would be like to program it from the 
interpreter. 

In C, Size can be any integer between 1 and 256 inclu- 
sive. In silicon. Size may better work as 1, 2, 4, 8, etc. Maybe 
not. As it is, a size that can match, e.g., Intel floating-point 
^ stack item sizes, is a happy accident of this implementation. 

Loop indices are pytes. Pytes are relatively worse perfor- 
mance-wise in C than they would be in silicon, and will be 
the next thing I change in H3sm. I'm quite pleased that 
H3sm loops benchmark at about half the speed of Perl, for 
such a fragile demo, but with int loop indices on the re- 
turn stack she should run distinctly more like a Forth. 

impressions 

Well, / like it. I think it was worth doing. I see some pos- 
sibility for pytes to reduce the "What is an int?" problems 

Three-Stack Machine continues on page 67. 
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Polyalphabetic Encryption Cracker 



The Polysub — well-known but not very secure 

In this article, we will present the Polyalphabetic Substi- 
tution Cipher (the "PolySub"), which most readers should be 
familiar with. This is the one in which a key is repeatedly 
XOR'd with the plaintext to produce the ciphertext (or vice- 
versa). We will then present a program which will crack this 
cipher, working on the assumption that the plaintext charac- 
ters have varjang frequencies and that one plaintext charac- 
ter in particular is much more frequent than the others (we 
assume that this character is the blank). This article is ori- 
ented toward novices, so we provide a lot of implementa- 
tion-level description of our encryption-cracking program. 

On computers, the PolySub is usually implemented with 
XOR. This allows the same program to be used for both en- 
cryption and decryption, since XOR undoes itself. In pre-com- 
puter days, plus and minus were used. The PolySub was in- 
vented in 1568 by Leon Battista and was used extensively by 
the Union Army during the American Civil War. 

Let's look at an example of the PolySub using plus and 
minus. We will use an alphabet of all capitals encoded 
through 26 (a blank is a zero), and we will use a key of "DOOR" 
(see Figure One). 

We are using modular addition to encrypt. The message 
would be decrypted by using modular subtraction. This 
method is an excellent system for anybody who doesn't have 
access to an electronic computer. The reason is that one can 
easily construct a "computer" out of cardboard. 

The idea is to have a circular piece of cardboard riveted, 
through its center, to another piece of cardboard so that it 
can be spun freely. Both wheels have the alphabet written on 
them clockwise. To encrypt or decrypt, one locates the cur- 
rent letter from the key stream on the inner wheel and lines 
it up with the blank on the outer wheel. 

If encrypting, one then locates the current message letter 
on the outer wheel and finds the corresponding encryption 
letter on the inner wheel. If decrypting, 
one would locate the current encrypted 
letter on the inner wheel and find the cor- 
responding message letter on the outer 
wheel. Note that the famous Julius Caesar 
encryption scheme (adding three to every 
letter) is just a degenerative form of the 
Plus-Minus scheme. It has only a single 
character long key (the "C"). The Julius 
Caesar scheme is a Monoalphabetic Sub- 
stitution cipher. 

The PolySub's security can be enhanced 
a little bit by having a long key. This is best 
accomplished by repeatedly encrypting the 
message. For example, if your key is 



"DOORFENCE" the key length is nine. On the other hand, if 
the message is first encrypted with "DOOR" and then with 
"FENCE", the effect is the same as if it was encrypted once with 
a twenty-character (4*5) key of "JTCUIUTFGTUWRRTXICRW". 
The security is actually a little better, because the effective twenty- 
character key is a jumble of characters and can't be as easily 
guessed as "DOORFENCE" which is composed of recognizable 
English words. When doing multiple encryptions like this, one 
should be sure that none of the key lengths have common de- 
nominators. If they are the same lengths, for example "DOOR" 
and "GATE", it is still effectively a four-character key (although 
the characters at least are jumbled as "KPIW"). 

The first part of the CrakPoly program is the code to load 
and save files, and to encrypt and decrypt them. After that, 
we get into cracking ciphers for which we don't have a key. 
There are two phases to cracking the PolySub: the first is de- 
termination of the key length, and the second is determina- 
tion of the key contents. 

Preliminary Code — encrypting and decrypting files 

Our PolySub cracking program is called " CrakPoly. scr" and 
is written in UR/Forth from Laboratory Microsystems, Inc. 
The source code is in Figure Two. CrakPoly should run under 
any Forth-83 compiler. It has been tested under both 32-bit 
and 16-bit UR/Forth. The reader is encouraged to put Ql (pro- 
vided on screen 5) inside various words as an aid to dissecting 
the program. Execution will stop and the user can examine 
the contents of variables before continuing with the program. 

We have two data buffers, ciphertext and plaintext. 
These each have file_size bytes of memory allocated to 
them. FILE_SIZE is defined in screen 1 and is currently set 
quite small, so readers with eight-bit computers can load and 
run the program. Readers with 32-bit computers should set 
FILE_SIZE larger. 

The word input file in screen 17 is used to load a file 



Figure One. 

meatloaf for DINNER <~ the plaintext (unencrypted) message 
DOORDOORDOORDOORDOO <~ the key stream 

QTPKPCPXDUCIDSXERTF <~ the Ciphertext (encrj^ted) message 

Below is the same thing in numerics. 
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Figure Two. 

Screen # 
\ CRAKPOLY 



19:36 05-29-99 



Crack the polyalphabetic substitution cipher (XOR) . 
written by Hugh Aguilar 

January/February/March/April 1999 Forth Dimensions 



Screen # 1 

\ word size arithmetic CHARS MOSTEST 



20:33 05-30-99 



WSIZE CONSTANT W 



\ less cumbersome to type 



these 


depend upon 


W+ 


4 + ; 


W- 


4 - ; 


W* 


2* 2* ; 


W/ 


2/ 2/ ; 



256 CONSTANT CHARS 

CREATE MOSTEST , BL MOSTEST C! 
\ most frequent plain char 



5000 CONSTANT FILE SIZE 



\ maximum file size 



Screen # 2 

\ LOW_ENCRYPT LOW_DECRYPT for Plus-Minus system 11:39 05-31-99 
\\ Plus-Minus system 

: LOW_ENCRYPT \ plain_char key_char — cipher_char 
+ DUP CHARS >= IF CHARS - THEN ; 

: LOW_DECRYPT \ cipher_char key_char -- plain_char 
- DUP 0< IF CHARS + THEN ; 



Screen # 3 

\ LOW_ENCRYPT LOW_DECRYPT for Minus-Plus system 20:12 05-30-99 
\\ Minus-Plus system 

: LOW_ENCRYPT \ plain_char key_char -- cipher_char 
- DUP 0< IF CHARS + THEN ; 

: LOW_DECRYPT \ cipher_char key_char -- plain_char 
+ DUP CHARS >= IF CHARS - THEN ; 



Screen # 4 

\ LOW_ENCRYPT LOW_DECRYPT for XOR system 
\ XOR system 



11:39 05-31-99 



: LOW_ENCRYPT \ plain_char key_char -- cipher_char 
XOR ; 

: LOW_DECRYPT \ cipher_char key_char — plain_char 
XOR ; 



into memory. It takes two parameters, 
the filename and the buffer pointer. The 
filename should be the address of a 
counted string containing the fully 
qualified filename. The buffer pointer 
should be either ciphertext or 
PLAINTEXT. output_file is also in 
screen 17 and also takes a filename and 
a buffer pointer, but it outputs the con- 
tents of the buffer to the file. 

If there is a document in plaintext, 
executing the word encrypt in screen 
14 will fill ciphertext with the en- 
crypted version of the document. Ex- 
ecuting the word decrypt, which is also 
in screen 14, will decrypt the document 
in ciphertext and fill plaintext with 
the unencrypted version. 

Note that ENCRYPT and decrypt use 
the words low_encrypt and 
low_decrypt which are defined in 
screen 4. These words in screen 4 are for 
the XOR PolySub. We also have versions 
of low_encrypt and low_decrypt in 
screens 2 and 3. Both of these screens 
are commented out. Screen 2 is for the 
Plus-Minus PolySub, and screen 3 is for 
the Minus-Plus PolySub. If the reader is 
using either of these kinds of PolySub, 
he should comment out screen 4 and 
compile screen 2 or 3, instead. 

Phase 1. — Determining key length 

In order to determine the key's 
length, we need to assume that the char- 
acters in the plaintext are of varying fre- 
quencies. We don't care which charac- 
ters are more frequent than the others 
or how they are distributed, so long as 
they aren't rectangularly distributed. We 
will repeatedly shift the ciphertext over 
and compare it against the original 
unshifted version of the ciphertext. We 
count how many of the characters be- 
ing compared are equal to the charac- 
ter they line up against in the unshifted 
version. 

We have an array called COINCI- 
DENCES. The first index is the count of 
coincidences for ciphertext being 
shifted over by one character, the sec- 
ond index is the count of coincidences 
for ciphertext being shifted over by two 
characters, and so forth. C0UNT_C0IN- 
CIDENCES in screen 18 counts these 
coincidenses. coincidences actually 
contains percentages, rather than raw 
counts, because a different number of 
comparisons is done by each call to 
count_coincidences. Our percentages 
have two decimal digits to the right of 
the decimal point. 
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FlLL_co INCIDENCES in Screen 19 
calls COUNT_co INCIDENCES repeatedly 
and fills the coincidences array. 
Note that we have a word called 
search size which determines how 
many shifts we do. If our file is small, 
we only do a third of the total. This is 
because the more we shift, the less ac- 
curacy we have. If we did the entire 
file size, our numbers toward the end 
would be garbage and would only 
mess us up. Note that, the way the 
author originally had CO UNT_co INCI- 
DENCES written, it would rotate 
ciphertext around such that the char- 
acters at the end of the file would be 
compared to the characters at the be- 
ginning. In this way, we wouldn't get 
decreasing accuracy with increased 
shifts. This turned out to be a bad idea, 
because it caused coincidences to get 
counted more than once, which 
tended to smooth out the numbers. 

Screen 21 contains the word SH0W_ 
COINCIDENCES which uses these 
words to show what is in the COIN- 
CIDENCES array. If the reader uses 
SHOW_COINCIDENCES tO look at CO- 
INCIDENCES, he should see there are 
spikes in the values. These spikes oc- 
cur on multiples of the length of the 
key used to encrypt ciphertext. By 
eyeballing COINCIDENCES, it is fairly 
easy for the user to determine the key 
length. 

We want our program to determine 
this automatically, however. There is 
some difficulty in this, because it is not 
clear what threshold a value must be 
over to be considered a spike. This 
threshold value varies with the data. 
Also, no matter how carefully the 
threshold value is set, some values 
which are spikes don't go over it, and 
some which aren't do go over it. There 
is a lot of variance in the data, espe- 
cially when cracking small files. 

We set our threshold to the mid- 
point of the data in COINCIDENCES. 
This is done by calc_threshold in 
screen 22. The author originally tried 
using a constant value of 4%. This 
didn't work, because the threshold is 
at different heights, depending upon 
the length of the key. The author then 
tried using the average. This didn't 
work either; it was way too small, es- 
pecially when the key length was 
large, and we got a lot of false spikes. 
The next attempt was to use the aver- 
age plus the standard deviation mul- 
tiplied by some empirically chosen 
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Screen # 5 

\ miscellaneous words 

: #? \ d — new_d \ used in <# . 
2DUP D0= IF BL HOLD ELSE # 

: #_ \ d -- new_d \ used in <# . 
2DUP D0= IF ELSE # 

: QI \ — 

QUERY INTERPRET ; 



12:03 05-30-99 

. #> for the most sig digits 
THEN ; 

. #> for the most sig digits 
THEN ; 



ROVER \abc — abca \" rot over" 
2 PICK ; 

ZERO \ adr -- \ zeros out the word at ADR 
SWAP ! ; 



Screen # 6 

\ miscellaneous words 13:23 05-31-99 

: U>= \ a b -- flag 
U< 0= ; 

: INC \ adr -- \ increments the value 
1 SWAP +! ; 

: P_ALLOT \ — \ allots enough that HERE is paragraph aligned 
HERE 16 MOD ? DUP IF 16 SWAP - ALLOT THEN ; 

: PCREATE \ allotment — \name \ paragraph aligned CREATE 
P_ALLOT HERE >R ALLOT R> CONSTANT ; 

\ Don't use PCREATE in conjunction with DOES> . 



Screen # 7 

\ CARRAY WARRAY 19:39 05-30-99 

\ Note that "base_adr" means the address provided by DOES> 

: CARRAY \ size -- \name \ paragraph aligned char array 

CREATE HERE >R , P_ALLOT HERE R> ! ALLOT 
DOES> \ index base_adr -- address 
@ + ; 

: WARRAY \ size — \ name \ word array 

CREATE W* ALLOT 

DOES> \ index base_adr -- address 
SWAP W* + ; 



Screen # 8 

\ 2CARRAy WITHIN 19:39 05-30-99 

\ Note that "base_adr" means the address provided by DOES> 

: 2CARRAY \ horz_size vert_size -- \name \ 2D char array 
CREATE OVER , DUP , * ALLOT 

DOES> \ horz_index vert_index base_adr -- address 
DUP W+ W+ >R \ return: data_adr — 
@ \ horz_index vert_index horz_size — 

* + R> + ; 

: WITHIN \ char lowest highest — flag 
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>R >R 
DUP R> >= 



SWAP R> <= 



AND 



Screen # 9 

\ PRINTABLE NUMERIC SPANISH 

: PRINTABLE \ char — flag 
32 127 WITHIN ; 

: NUMERIC \ char -- flag 

ASCII ASCII 9 WITHIN 
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SPANISH \ char 
>R 

R@ 144 = OR 
R@ 162 = OR 
R@ 165 = OR 



- flag \ accented chars and upside-down ? 
R@ 129 = R@ 130 = OR 

R@ 160 = OR R@ 161 = OR 

R@ 163 = OR R@ 164 = OR 

R@ 168 = OR R> 173 = OR ; 
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\ These are char_kind filter words. 
Screen # 10 

\ UPPERCASE ALPHA ALPHANUMERIC PUNCTUATION 
: UPPERCASE \ char -- flag 

ASCII A ASCII Z WITHIN ; 

: LOWERCASE \ char — flag 

ASCII a ASCII z WITHIN ; 

: ALPHA \ char — flag 

DUP UPPERCASE SWAP LOWERCASE OR ; 

: ALPHANUMERIC \ char -- flag 

DUP ALPHA SWAP NUMERIC OR ; 



: PUNCTUATION \ char — flag \ also includes the blank 

DUP ALPHANUMERIC 0= SWAP PRINTABLE AND ; 
\ These are char kind filter words. 



Screen # 11 

\ constants and variables 
100 CONSTANT KEY_SIZE 
KEY_SIZE CARRAY KEY_STRING 

KEY_SIZE WARRAY KEY_LENGTHS VARIABLE BIG_KEY_LENGTHS 

KEY_SIZE CHARS 2CARRAY KEY_CHAR 

VARIABLE KEY LENGTH \ actual key size 
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FILE_SIZE PCREATE CIPHERTEXT 
FILE_SIZE PCREATE PLAINTEXT 
VARIABLE FILE_MORE 
VARIABLE FILE_LENGTH 
VARIABLE PAST_CIPHER 

250 CONSTANT NON_CHAR 
16 CONSTANT DUMP_WIDTH 
18 CONSTANT SHOW KEYS 



\ where we try to put more of file 

\ actual file si;ze 

\ ptr past valid data in CIPHERTEXT 

\ print this for nonprintable chars 

\ horizontal chars in DUMP display 

\ keys shown by SHOW_KEY 



Screen # 12 

\ constants and variables DOSINT FILEl 

300 CONSTANT MAX_SEARCH_SIZE 

MAX SEARCH SIZE WARRAY COINCIDENCES 
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constant. For example, a constant of .68 
will result in 75% of the values being 
under the threshold. This worked bet- 
ter, but it was overly complicated and 
still not good enough. 

The midpoint worked best and was 
very simple. We have spikes clustered 
around some high value and non-spikes 
clustered around some low value. There 
are more non-spikes than spikes, espe- 
cially when the key length is long, and 
this is what was messing us up when 
we were using the average. This dispar- 
ity was what we were trying to compen- 
sate for with the standard deviation. By 
using the midpoint, we avoid concern- 
ing ourselves with how many spikes 
there are, relative to the number of non- 
spikes. The midpoint draws a line 
betwean the highest value and the low- 
est value, and this line pretty much 
separates the spikes from the non- 
spikes. CALC_THRESHOLD doesn't have 
to be perfect, because the key_lengths 
array, described next, smooths over er- 
rors caused by values being seen as 
spikes when they are non-spikes, and 
vice-versa (as long as there aren't too 
many errors). 

We have an array called ke y_lengths 
as big as our maximum key size, and 
which we will fill with percentage 
probabilites of the key being any par- 
ticular length. We have to do this be- 
cause there is no way to be absolutely 
sure of the key length, due to the vari- 
ance mentioned earlier. fill_key_ 
LENGTHS in screen 23 fills this array. This 
word calculates the distances betwean 
the spikes. If all these distances were the 
same, we would know for sure that this 
distance was the key length. They usu- 
ally aren't, so we just count the times 
we see the different distances. 

These counts go in key_lengths. 
KEY_LENGTHS% in screen 24 converts 
these counts into percentages. This is 
mostly for aesthetic purposes when dis- 
playing them later; calc_key_length 
doesn't need it done. We also have a 
variable called big_key_lengths 
which counts any spike distances which 
are too big to fit in key_lengths. Hope- 
fully, this will be zero. 

calc_key_length calculates the 
actual key length. First it fills key_ 
lengths, then it searches through 
key_lengths for the biggest value. The 
index to this value is our key length. If 
we have two or more values which are 
equal, we go with the smallest index. 
In almost all cases when this happens, 
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the higher index is a multiple of the 
lower one. The smallest is the actual key 
length (otherwise, we would have a key 
which was some string repeated some 
number of times). 

Screen 25 contains FILL_KEY_ 
LENGTH, which does everything needed 
to determine the key length. This is the 
word the user will type at the keyboard 
in order to do phase one of the program. 
Note that, if the user disagrees with the 
program's idea of what the key length 
is, he can use key_length! to set it 
manually. fill_key_length displays 
the front portion of COINCIDENCES at 
the top of the screen. This raw data is 
only marginally useful. fill_key_ 
LENGTH displays key_lengths at the 
bottom of the screen. The user can see 
here what the probabilities of the vari- 
ous key lengths are. These are a guide 
for what to give ke y_LENGTH ! if the user 
disagrees with what the program found 
to be the most likely. In practice, this is 
rarely needed; fill_key_length is al- 
most always correct. 

Phase 2. — Determining key contents 

We are ready for phase two, determi- 
nation of what the contents of the key 
are. The individual characters of the key 
are solved for as if they were of distinct 
Mono-alphabetic ciphers. The second 
phase of the program the author found 
to be more straightforward than the first 
phase. It is all downhill from here! 

In screen 1, we have a variable called 
mostest which contains the plain char- 
acter we think will be the most fre- 
quently occuring. This defaults to the 
blank. This value is not normally 
changed during the program's execu- 
tion. It is made a variable rather than a 
constant, however, because the user may 
want to change it if he is decrypting 
some file which is not text. This change 
can be made without having to 
recompile the program. Note that, 
sometimes even in English text, the 
blank is not the most frequent charac- 
ter. Consider Figure Three, in which 'e' 
is the most frequent. 



Figure Three. 

Elephants are very eloquent. 
Especially Penelope! 



The program will still successfully 
crack ciphers like this. The text file for 
this article has 1.74 times as many 
blanks as 'e' characters. The ratio might 



10000 constant unity \ multiplier for percents 

\ percents with two digits to right of decimal point 

VARIABLE THRESHOLD \ height to be considered a spike 
CHARS WARRAY FREQS \ count of encryption results 



\ vector to LOW_ENCRYPT or LOW_DECRYPT 
\ vector to char kind checking word 
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VARIABLE 'LOW_ENCRYPT 
VARIABLE 'CHAR_KIND 
DOSINT 

CONSTANT READ_ONLY 

1 CONSTANT WRITE_ONLY 

2 CONSTANT READ_WRITE 

HCB FILEl \ handle control block 



Screen # 13 

\ <ENCRYPT> 

VARIABLE SRC \ either CIPHERTEXT or PLAINTEXT 
VARIABLE DST \ either CIPHERTEXT or PLAINTEXT 

: ADVANCE_KEY_INDEX \ key_index — new_key_index 
1+ DUP KEY LENGTH @ = IF DROP THEN ; 



: <ENCRYPT> \ source dest -- \ either CIPHERTEXT or PLAINTEXT 
DST ! SRC ! \ key_index -- 
FILE_LENGTH @ DO 

SRC @ I + C@ OVER KEY_STRING C@ 'LOW_ENCRYPT PERFORM 

DST @ I + C! 

ADVANCE_KEY_INDEX LOOP 
DROP ; 



Screen # 14 

\ ENCRYPT DECRYPT GET_KEY 

: ENCRYPT \ -- 

[ '] LOW_ENCRYPT 'LOW_ENCRYPT ! 
CIPHERTEXT FILE_SIZE ERASE 
PLAINTEXT CIPHERTEXT <ENCRYPT> ; 

: DECRYPT \ -- 

[ '] LOW_DECRYPT ■LOW_ENCRYPT ! 
PLAINTEXT FILE_SIZE ERASE 
CIPHERTEXT PLAINTEXT <ENCRYPT> ; 

: GET_KEY \ cipher_char plain_char ■ 
LOW DECRYPT ; 
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key_char 



Screen # 15 

\ KEY_LENGTH! KE?Y_STRING! SHOW_KEY_STRING 

: KEY_LENGTH! • \ key_length — 

DUP KEY_SIZE > ABORT" too long of a key" 
KEY_LENGTH ! ; 

: KEY_STRING! \ counted_string — 
COUNT DUP KEY_LENGTH! 
DO 

DUP C@ I KEY_STRING C! 
1+ LOOP 
DROP ; 
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: SHOW_KEY_STRING \ — 

KEY STRING KEY LENGTH @ DUMP 



Screen # 16 

\ SHOW_PLAIN INIT_KEY_LENGTHS 

: <SHOW_PLAIN> \ from -- 
DECRYPT 

PLAINTEXT + 320 DUMP ; 
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\ a screenfull pretty much 



: SHOW_PLAIN \ -- 
<SHOW_PLAIN> ; 

: INIT_KEY_LENGTHS \ -- \ sets BIG_KEY_LENGTHS as well 
KEY_SIZE DO I KEY_LENGTHS ZERO LOOP 
BIG KEY LENGTHS ZERO ; 



Screen # 17 

\ INPUT FILE OUTPUT FILE 
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INPUT_FILE \ filename buffer_ptr -- 

>R FILEl NAME>HCB R@ FILE_SIZE ERASE 

FILEl READ_ONLY FOPEN ABORT" can't open file for input." 
FILEl R> FILE_SIZE FREAD FILE_LENGTH ! 

FILEl FILE_MORE 1 FREAD ABORT" File is too big to load." 
FILEl FCLOSE ABORT" can't close file for input." ; 

OUTPUT_FILE \ filename buffer_ptr -- 
>R FILEl NAME>HCB 

FILEl WRITE_ONLY FMAKE ABORT" Can't open file for output." 

FILEl R> FILE_LENGTH @ FWRITE 

FILE_LENGTH @ < ABORT" Disk is full." 

FILEl FCLOSE ABORT" Can't close file for output." ; 



Screen # 18 

\ COUNT_COINCIDENCES FILL_PAST_CIPHER 
VARIABLE COIN_COUNT 
VARIABLE COIN SUM 
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: COUNT_COINCIDENCES \ cipher_ptrl cipher_ptr2 — percentage 

COIN_COUNT ZERO COIN_SUM ZERO 

BEGIN DUP PAST_CIPHER @ U< WHILE 

OVER C@ OVER C@ = IF COIN_SUM INC THEN 
SWAP 1+ SWAP 1+ COIN_COUNT INC REPEAT 

2 DROP 

COIN_SUM @ UNITY COIN_COUNT @ * / ; 
\ cipher_ptrl is < cipher_ptr2 

: FILL_PAST_CIPHER \ -- 

CIPHERTEXT FILE LENGTH @ + PAST CIPHER ! ; 



Screen # 19 

\ SEARCH_SIZE KEY_SEARCH_SIZE FILL_COINCIDENCES 12:08 05-30-99 

: SEARCH_SIZE \ — search_size 

FILE_LENGTH @ 3 / MAX_SEARCH_SIZE MIN ; 
\ We never shift less than one third of the file length. 
\ This value is empirically determined. 



: KEY_SEARCH_SIZE \ -- )<;ey_search_size 



be closer to 1 .0 for languages other than 
English, or by happenstance in short 
files. The MO S TEST character doesn't 
have to strictly be the most frequent, as 
long as it is very frequent. The reason is 
that, in our KEY CHAR array, we calcu- 
late the 256 best guesses for each char- 
acter in the key. We have various ways 
of filtering out the "best" guesses, if they 
aren't likely to be characters the 
encrypter would have used in his key. 

We have a two-dimensional array 
called KEY CHAR which we are going to 
fill. Row in the KEY_CHAR array will 
contain our best guess for what the key 
is. Row 1 is the second-best guess, and 
so forth. Let's first look at fill_key in 
screen 28, and then work our way back 
through the lower-level routines. 

FILL_KEY calls fill_freqs in 
screen 26 for each character of the key 
(column of KEY_CHAR). FILL_FREQS 
takes a pointer into ciphertext and 
increments through ciphertext by the 
key length. fill_freqs counts how 
many of each character is represented 
in ciphertext. fill_freqs is making 
this calculation as if for a Mono-alpha- 
betic Substitution cipher whose charac- 
ters just happen to be regularly spaced 
every key_length characters inside 
CIPHERTEXT. 

fill_key then calls column_fill_ 
key which will fill one column of 

KEY_CHAR. COLUMN_FILL_KEY calls 

single_fill_key in screen 27 for each 
row. single_fill_KEY takes the hori- 
zontal and vertical indices which it will 
be setting in key_char. single_fill_ 
key finds the cipher character in freqs 
which appears most often and assumes 
this must correspond to the mostest 
plain character. SINGLE_fill_key cal- 
culates what key character would have 
produced this cipher character, assum- 
ing that the plain character is the 
mostest character. This character is 
stored in key_char. single_fill_key 
returns this most-frequent cipher char- 
acter, the index into FREQS which 
pointed to the highest value. COLUMN_ 
FILL_KEY Stores a -1 value into this spot 
in FREQS before moving on to calculat- 
ing the next most likely character. This 
is done so single_fill_key doesn't 
find the same best value over and over. 

Screen 30 has the to_key_string 
routine. The author originally just cop- 
ied rowO of KEY_CHAR over to 
key_string. This needed some en- 
hancement. We were not taking into 
consideration that very few people are 
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going to have a key with unprintable 
characters in it. We want to filter these 
out. We have several ways of filtering 
out unwanted characters. TO_KEY_ 
STRING takes the cfa of a charjcind word 
(one of PRINTABLE, NUMERIC, SPANISH, 
UPPERCASE, LOWERCASE, ALPHA, ALPHA- 
NUMERIC, and PUNCTUATION). TO_KEY_ 
STRING searches down each column in 
KEY_CHAR and finds the first character 
in the char_kind class which TO_KEY_ 
STRING was given. Every column of 
KEY CHAR will hold every possible char- 
acter (each column has 256 entries), so 
we are bound to find something that 
satisfies our char_kmd requirement. In 
this way, we get the best guesses which 
are of some char_kind class. 

FILL_KEY_STRING does everything 
needed to determine the key contents. 

FILL_KEY_STRING USeS ALPHA as its 

default char_kind. fill_key_string is 
the word the user will type at the key- 
board in order to do phase two of the 
program. In practice, especially when 
cracking short files, FILL_KEY_STRING 
will provide an incomplete answer 
(some key characters are right and some 
are wrong). 

Interactive Guessing — Often needed 
on short files 

There are two ways for the user to 
deal with an incorrect KEY_STRING con- 
tent. One is to guess what the key string 
is, the other is to guess what the 
plaintext is. Often, by looking at the key 
string shown, the user can spot English 
words. If some characters seem wrong, 
look at the display of key_char above 
for that character's column. 

Scan down from the top to find a 
likely looking character. Use key_ 
STRING! to set key_string. Use 
SHOW PLAIN to see the resulting 
plaintext. The user can also use 
to_key_string with some other 
char_kind routine (followed by SHOW_ 
key string) to try various filters. We 
have lots of charjdnd routines. Note 
that encrypters sometimes are required 
to change their key every month. Of- 
ten, people pick a key which is always 
used and then append the two-digit 
month number (01 of January, etc.) on 
the end of it. Look for patterns like this. 

Back on screen 25, we had a word 
called TRY. After we have determined 
our key string we normally run 
show_plain to see what we have 
achieved. We may find that the result is 
recognizeable text, but that some of the 



SEARCH_SIZE KEY_SIZE MIN ; 

F I LL_CO incidences \ — \ coincidences within CIPHERTEXT 
FI LL_PAST_C I PHER 

SEARCH_SIZE 1 do \ minimum key length is 1 
CIPHERTEXT DUP I + C0UNT_C0INCIDENCES 
I COINCIDENCES ! LOOP ; 



Screen #20 

\ SHOW INDEX SHOW PERCENTAGE SHOW TABLE ENTRY 
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SHOW_INDEX \ index -- 
<# # #? #? #> TYPE 



: SHOW_PERCENTAGE \ percentage — \ 2 digits right of decimal 
10 / \ get rid of low digit 

<# # ASCII . HOLD # #? #_ #> TYPE ." " ; 

: SHOW_TABLE_ENTRY \ percentage index -- 
SHOW_INDEX SHOW_PERCENTAGE ; 

VARIABLE SHOW_FROM \ Starting index in PERCENTAGES 

48 CONSTANT SHOW_TOTAL \ total percentages shown 
8 CONSTANT SHOW ROW \ should be a denominator of SHOW TOTAL 



Screen # 21 

\ SHOW COINCIDENCES SHOW KEY LENGTHS 
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SHOW_COINCIDENCES \ from — \ show SHOW_TOTAL at FROM 
SHOW_FROM ! CR 

SHOW_FROM @ SHOW_TOTAL + SEARCH_SIZE MIN SHOW_FROM @ ? DO 
I COINCIDENCES @ I SHOW_TABLE_ENTRY 
I 1+ SHOW_FROM @ - SHOW_ROW MOD 0= IF CR THEN 
LOOP ; 

SHOW_KEY_LENGTHS \ — \ show them all 
CR KEY_SIZE 1 DO 

I KEY_LENGTHS @ I SHOW_TABLE_ENTRY 
I SHOW_ROW MOD 0= IF CR THEN 
LOOP 

CR ." too big = " BIG_KEY LENGTHS @ SHOW PERCENTAGE ; 



Screen # 22 

\ CALC THRESHOLD 
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VARIABLE COIN_MIN \ smallest value found in COINCIDENCES 
VARIABLE COIN_MAX \ largest value found in COINCIDENCES 

: CALC_THRESHOED \ — threshold \ midpoint of COINCIDENCES 

100 COIN_MIN ! COIN_MAX ! 

SEARCH_StZE 1 DO I COINCIDENCES @ 

DUP COIN_MIN @ < IF DUP COIN_MIN ! THEN 
DUP COIN_MAX @ > IF DUP COIN_MAX ! THEN 
DROP LOOP 

COIN MAX @ COIN MIN @ - 2/ COIN MIN @ + ; 



Screen # 23 

\ FILL KEY LENGTHS 
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<FILL_KEY_LENGTHS> \ distance_f rom_last_spike -- 
DUP KEY SIZE < IF \ within key 
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KEY LENGTHS INC 



ELSE 

DROP BIG_KEY_LENGTHS INC THEN ; 

FILL_KEY_LENGTHS \ — spike_count 
\ spike_count last_spike -- 
SEARCH_SIZE 1 DO 

I COINCIDENCES @ THRESHOLD @ U> IF \ found a spike 
I SWAP - <FILL_KEY_LENGTHS> 
1+ I THEN \ spike_count last_spike — 

LOOP 

0= ABORT" We found no spikes at all!" ; 



Screen # 24 

\ KEY LENGTHS % CALC KEY LENGTH 
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: KEY_LENGTHS% \ spike_count — \ change to percentages 
KEY_SIZE 1 DO 

I KEY_LENGTHS @ UNITY ROVER */ I KEY_LENGTHS ! 
LOOP 

BIG_KEY_LENGTHS @ UNITY ROT * / BIG_KEY_LENGTHS ! ; 

: CALC_KEY_LENGTH \ -- length 

INIT_KEY_LENGTHS FILL_KEY_LENGTHS KEY_LENGTHS% 
\ max_key_length -- 
KEY_SIZE 1 DO 

I KEY_LENGTHS @ OVER KEY_LENGTHS 9 > IF 

DROP I THEN 
LOOP ; 

\ CALC KEY LENGTH uses the lower index if two have = values. 



Screen # 25 

\ FILL KEY LENGTH TRY 
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FILL_KEY_LENGTH \ — 

FILL_COINCIDENCES 1 SHOW_COINCIDENCES 
CALC_THRESHOLD THRESHOLD ! 

CR ." threshold = " THRESHOLD @ SHOW_PERCENTAGE 
CALC_KEY_LENGTH KEY_LENGTH ! SHOW_KEY_LENGTHS 
CR ." Key length is: " KEY_LENGTH @ . ; 

TRY \ plain_char horz_index vert_index -- 

(Figure Two — source code — continues on page 24.) 



Figure Four. 

AMENDMENT 4 . 

The right of the people to be secure in their persons, 
houses, papers, and effects, against unreasonable searches 
and seizures, shall not be violated, and no warrents shall 
issue but upon probable cause, supported by oath or 
affirmation, and particularly describing the place to be 
searched, and the persons or things to be seized. 



Figure Five. 



MESSAGE.TXT" PLAINTEXT INPUT_FILE 
Very-Personal" KEY STRING! ENCRYPT 



characters are wrong. These wrong char- 
acters correspond to erroneous charac- 
ters in KEY_STRING. Fixing this inter- 
actively is what TRY is for. 

TRY takes a plain character, and a 
horizontal and vertical index into 
PLAINTEXT. We are hoping this plain 
character is what should go in that spot 
in PLAINTEXT. The reason we have a 
horizontal and vertical index into 
PLAINTEXT is that the DUMP in 
SHOW_PLAiN displays plaintext as a 
two-dimensional array. We are, presum- 
ably, using TRY after using SHOW_PLAiN 
while we are looking at SHOW_PLAiN's 
output. TRY fixes the corresponding 
character in KEY_STRING and reruns 
SH0W_PLAIN. We can TRY another char- 
acter, or we can stop if our plain text 
looks correct. This is quite similar to the 
Jeopardy game, in which a person looks 
at a plaintext message with some of the 
characters missing and tries to guess 
what those characters are. When the 
plaintext appears to be correct, execute 
SHOW_KEY_STRiNG to find out what key 
TRY has built. 

An Example Run — The program from 
the user's persective 

We are done with our study of the 
encryption-cracking program. Let's run 
through an example. The reader should 
enter the text in Figure Four exactly, and 
save it in a file called Message.txt. Be 
careful to put the end-of-lines at the 
same places so that the program will give 
the exact same results we will describe 
here. Message.txt should have a length 
of 354 characters. 

Execute the code shown in Figure 
Five in order to fill CIPHERTEXT with en- 
crypted data. Now pretend you don't 
know what the plaintext is or what the 
key is, and try to crack the cipher. First 
execute fill_key_length. This will re- 
sult in an output as shown in Figure Six. 
It seems clear that the key length is 13, 
since there is a 60% chance this is true. 
We have a 20% chance of it being 26, 
and a 20% chance of it being 52. Note 
that both 26 and 52 are multiples of 13. 
Take a glance over the COINCIDENCES 
data at the top and note that 13 has a 
value of 9.9%, which is considerably 
higher than the other values. This is defi- 
nitely a spike. 

Execute fill_key_string. This will 
result in an output as shown in Figure 
Seven. The program has found 
"VeryePersonal". This looks good, except 
for that 'e' after "Very". Look at the 
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KEY CHAR data shown at the top of the 
screen. Scan down the fifth column. The 
top character is 'e' and the second best 
one is The hyphen seems likely. Ex- 
ecuting the code " Very-Personal" 
KEY_STRING ! SHOW_PLAIN Will Show 

that this is the correct key. An alterna- 
tive to scanning down columns in 
KEY CHAR would be to use the char_kind 
filters. It seems clear there must be some 



some characters wrong. For example, on the sixth row we see a word "pa8ers". We 
can guess that this is supposed to be the word "papers". Execute ASCII p 2 5 TRY 
to try out a 'p' in place of that '8'. Note that we are using a horizontal index of 2, 
since we start counting at zero. We are also using a vertical index of 5, since we 
count the rows from the top down, starting at zero. TRY automatically executes 
SHOW_PLAiN after fixing its KEY_STRING character so the user can see the result. 
Sometimes it is necessary to use try several times to fix several characters (or to fix 
one character, if you're not sure what it should be). When the plaintext looks 
correct, use SH0W_KEY_STRING to find what key you built with your various TRY 
executions. 
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(Figure Two - source code - continued.) 

16 * + >R R@ KEY_LENGTH @ MOD KEY_STRING 
R> CIPHERTEXT + C@ \ plain_char key_ptr cipher_char — 
ROT GET_KEY SWAP C! 
SHOW_PLAIN ; 
\ TRY assumes PLAINTEXT is paragraph aligned. 

\ TRY acts like PLAINTEXT is a 16 wide 2d array (as DUMP shows) 



Screen # 26 

\ INIT_FREQS FILL_FREQS 

: INIT_FREQS \ -- 

CHARS DO I FREQS ZERO LOOP ; 



: FILL_FREQS \ cipher_ptr — \ steps by KEY_LENGTH 
INIT_FREQS 

PAST_CIPHER @ SWAP DO 
I C@ FREQS INC 
KEY LENGTH @ +LOOP ; 



12:33 05-30-99 



Screen # 27 

\ BEST CIPHER CHAR SINGLE FILL KEY 



20:10 05-30-99 



: BEST_CIPHER_CHAR \ — best_cipher_character 

-1 -1 \ best_cipher_char best_cipher_char_occurances -- 
CHARS DO \ I is the test character 
I FREQS @ OVER > IF 2DR0P 
I I FREQS @ THEN 

LOOP 

-1 = ABORT" FREQS was corrupt" ; 
: SINGLE_FILL_KEY \ horz_index vert_index — best_cipher_char 

(Figure Two - source code - continues on next page.) 

Figure Eight. 



Final Thoughts — PolySub encryption 
is a toy algorithm 

Try the program using different key 
sizes. Try it with "SUPERCALIFRAGI- 
LISTIC" for a difficult exercise, and with 
"UNIQUE" for an easy exercise. Try it 
using a key containing mixed upper- 
case, lowercase, numbers, and so forth. 
It is kind of fun to crack ciphers with 
the program; it can be like solving a 
puzzle. Readers may also find it enjoy- 
able to beef up CrakPoly in various 
ways. There are enhancements which 
would make CrakPoly better at crack- 
ing very short ciphers, though it is al- 
ready quite good. Our Message.txt file 
was only 354 bytes, and CrakPoly 
cracked it with ease. The best enhance- 
ment would be to get rid of try's need 
for numeric coordinates into 
PLAINTEXT. Entering these is tedious 
and error-prone. We would want try to 
allow the user to move a cursor around 
in the plaintext with his arrow keys. 
When he got his cursor over an offend- 
ing character, he would type the correct 
character and TRY would fix the key 
string and display a regenerated 
PLAINTEXT. 

It is hoped that the reader has found 
our discussion of CrakPoly to be inter- 
esting. There might be a few readers who 
have a practical need for it. An example 
would be a company owner who could 
write a PolySub program and give it to 
his employees, saying, "Use this on all 
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sensitive documents to prevent corpo- 
rate espionage." Many of the employ- 
ees will use it on personal documents 
they are storing on company comput- 
ers. These, of course, are what the com- 
pany owner was actually interested in. 
For the most part, however, CrakPoly is 
just a toy program without any commer- 
cial prospects. 

CrakPoly could only be written in 
Forth, and it would never have been 
written in C++. The reason is that 
CrakPoly is necessarily interactive, with 
TRY and TO_KEY_STRING and 
KEY_STRING ! and SO forth. To write a 
GUI that would achieve this level of 
interactiveness would be more work 
than would be justified for a toy pro- 
gram. All commercial products these 
days have GUI interfaces, and C++ is 
oriented towards writing GUIs. C++ does 
not have any ready facility for execut- 
ing commands from the keyboard. The 
author has used LEX/YACC under C++ 
to provide programs with a command- 
line interface. This is a powerful tech- 
nique, but it also requires a lot of work. 
In Forth, the command-line interface is 
free. In general, a person who only 
knows C++ would have to decide that 
CrakPoly requires more work than it is 
worth, and would never start the 
project. This would be a shame, because 
CrakPoly does have some value. 

The author found that writing 
CrakPoly was fun, and that using it is fun, 
too. Also, designing and writing fun pro- 
grams is good practice for working on 
commercial products. C++, with its em- 
phasis on GUIs and commercial develop- 
ment, requires too much work to be used 
in weekend projects. Because nobody pro- 
grams as a leisure activity anymore, in so 
doing getting practice at programming, 
our professional programming is now 
described with terms like "death march 
project" and "anti-pattern." These appar- 
ently are the wages of professionalism. 

In case any reader has been using the 
PoIySub to encrypt anything of value, 
this article should dissuade him. Per- 
haps, in the future, we can delve into 
writing an encryption program which 
does provide good security. In the mean- 
time, the reader is encouraged to use 
PGP, which provides good security and 
is a standard method of encryption. It 
is good to have a standard so that ev- 
erybody can easily exchange encrypted 
files with one another. Standardizing on 
the PolySub because it is well-known, 
however, would be a mistake. 



(Figure Two - source code - continued.) 

KEY_CHAR >R BEST_CIPHER_CHAR 
DUP MOSTEST C@ GET KEY R> C! 



Screen # 28 

\ COLUMN FILL KEY FILL KEY 



12:34 05-30-99 



COLUMN_FILL_KEY \ horz_index — 

CHARS DO \ I is the vert_index 

DUP I SINGLE_FILL_KEY \ horz_index best_cipher_char — 
FREQS -1 SWAP ! \ won't be the best of next vert_inclex 
LOOP 

DROP ; 

FILL_KEY \ -- 

FILL_PAST_CIPHER 

KEY_LENGTH @ DO \ I is horz_index 
CIPHERTEXT I + FILL_FREQS 
I COLUMN_FILL_KEY 
LOOP ; 



Screen #29 



\ 



SHOW_KEY SHOW_KEY_HEX 
SHOW_KEY \ -- 
CR 

SHOW_KEYS DO 
KEY_LENGTH @ DO 

I J KEY_CHAR C@ 
EMIT ELSE 



19:18 05-29-99 



\ J = vert_index 
\ I = horz_index 

DUP PRINTABLE IF 
DROP NON CHAR EMIT 



THEN 



SPACE LOOP CR LOOP 



SHOW_KEY_HEX \ — 
CR BASE @ >R HEX 
SHOW_KEYS DO 
KEY_LENGTH @ DO 

I J KEY_CHAR C@ 
LOOP CR LOOP 
R> BASE ! ; 



\ J = vert_index 
\ I = horz_index 
<# # # BL HOLD #> TYPE 



Screen # 30 

\ TO_KEY_STRING FILL_KEY_STRING 
: <TO_KEY_STRING> \ -- 

KEY_LENGTH @ DO \ J 
I KEY_STRING C! 
CHARS DO \ I 

J I KEY CHAR C@ 



19:47 05-30-99 



is the horz_index 

\ default 
is the vert_index 
DUP 'CHAR KIND PERFORM IF 



J KEY STRING C! LEAVE ELSE DROP THEN 



LOOP 



LOOP ; 

TO_KEY_STRING 
'CHAR KIND ! 



\ char_kind_cf a - 
<T0 KEY STRING> 



FILL_KEY_STRING \ — 
FILL_KEY SH0W_KEy 
[ '] ALPHA TO KEY STRING 



SHOW KEY STRING 



Forth Dimensions XX.5,6 



25 



STANDARD FORTH TOOL BELT - #08 



PRESWOOP 



Rick VanNorman took my Simple Object-Oriented Pro- 
gramming and extended it. It is much more powerful. Be- 
cause of the extra power, it is no longer a simple implemen- 
tation, but it is still easy to use and fast. 

Rick implemented SWOOP for SwiftForth using general- 
purpose SwiftForth words. It is an easy task to define these 
general-purpose words in Standard Forth. With that prelude, 
SWOOP becomes available for Forths conforming to Standard 
Forth. I have been using Swoop in my work since the begin- 
ning of the year. 

If you already have definitions for these words with the 
same meaning, you should comment out those definitions 
here — especially when your definitions are more efficient. 

There are two problems not handled by Standard Forth, 
l.ln extending the set of classes, using marker may corrupt 
the list. In SwiftForth, PowerMacForth, and probably 
others, CHAIN name cooperates with MARKER to discard 



tokens that would cause trouble. 
2. ANS Forth specifies word list identifiers as "implementa- 
tion-dependent single-cell values that identify word 
lists," which is the weakest possible specification, mean- 
ing you know nothing about them. ANS Forth also 
ignores saving the system after compiling new defini- 
tions, and then reloading the system with a possible 
relocation of addresses. 

Some systems, such as PowerMacForth, define a word list 
identifier (wid) so that it is valid only in the session in which 
it's defined. To provide maintenance and transition, 
WORDLIST: should provide, in such systems, named word 
list identifiers that can be used across sessions. The defini- 
tion of WORDLIST: here is for implementations without a 
problem with word list identifiers. 



[ IF] ****************************************************** 
ANS Prelude for SWOOP 



All these definitions are generally useful. 



All code was checked m PowerMacForth and SwiftForth W l Ba'an, after 
years of profane language, retired to Standard Forth For the source for 
this article, send email requesting Standard Forth Tool Belt MS-'PRtSWOOR' 



Comment out definitions with the same meaning that you 
already have. 

{ 

CELL CELL- /ALLOT ?EXIT -EXIT !+ @+ STRING, 
CHAIN RELINK, >LINK 
-ORDER +ORDER 
CREATE-XT WORDLIST: 

***************************************** ^^* ********* * |- 'pf^gjq] 
[ IF] ^ 

{ begins a comment that may extend over multiple lines 
until a terminating right brace } is encountered. ( — ) 

This definition is first so it can be used henceforth. 



Wil Baden • wilbaden@netcom.com 
Costa Mesa, California 
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[ THEN] 

: NOT 0= ; 

: { ( "ccc . . .}" 

BEGIN SOURCE + 

[ CHAR] } PARSE + > NOT WHILE 
REFILL 0= UNTIL THEN ; IMMEDIATE 

{ 

CELL CELL- /ALLOT ?EXIT -EXIT !+ @+ STRING, 



— ) 
( addr) 
( ) 



CELL and CELL- are convenient for address arithmetic. 
/ALLOT allots and clears dataspace. 
?EXIT is IF EXIT THEN 
-EXIT is 0= IF EXIT THEN 

@+ fetches the value x from addr, and increments the address 
by one cell. ( addr -- addr+4 x ) 

!+ writes the value x to addr, and increments the address by 
one cell. ( addr x — addr-i-4 ) 

STRING, compiles the string at addr, whose length is u, in the 
dictionary starting at HERE, and allocates space for it. 

( addr u — ) 

These are all in SwiftForth, PowerMacForth, and others. 



1 CELLS CONSTANT CELL 
: CELL- CELL - ; 

: /ALLOT ( n — ) HERE SWAP DUP ALLOT ERASE ; 

: ?EXIT ( n -- ) \ IF EXIT THEN 

POSTPONE IF POSTPONE EXIT POSTPONE THEN ; IMMEDIATE 

: -EXIT ( n — ) \ 0= IF EXIT THEN 

POSTPONE 0= POSTPONE IF POSTPONE EXIT POSTPONE THEN ; 
IMMEDIATE 

: !+ ( addr n — addr+CELL ) OVER ! CELL+ ; 

: @+ ( addr -- addr+CELL n ) DUP CELL+ SWAP @ ; 
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STRING, ( str len — ) 

HERE OVER 1+ CHARS ALLOT 2DUP C! CHAR+ SWAP MOVE ; 



CHAIN RELINK, >LINK 



For relocation of machine addresses, they are referenced 
self -relative . 

CHAIN <name> defines the head of a linked-list of addresses. 
The list must be pruned when elements are forgotten. 
In SwiftForth and PowerMacForth this will be done for you. 

( "name" — ) 

RELINK, takes a link from one list and installs it in the 
current list. ( addr -- ) 

>LINK adds a link starting at HERE to the top of the linked 
list whose head is at addr (normally a variable) . The head 
is set to point to the new link, which, in turn, is set to 
point to the previous top link. ( addr -- ) 



: CHAIN ( "name" -- ) CREATE , ; 

: RELINK, ( a — ) DUP @ DUP IF OVER + HERE - THEN , DROP ; 
: >LINK ( a — ) ALIGN HERE OVER RELINK, OVER - SWAP ! ; 



{ 



-ORDER +ORDER 



-ORDER removes a word list from the context, wherever it is. 

( wid — ) 



+ORDER puts a word list in the context at the top. 



( wid -- ) 



: -ORDER ( wid ) 

>R GET-ORDER ( vidn 

DUP BEGIN DUP WHILE ( widn 

DUP 1+ PICK ( widn 

R@ = IF ( widn 
DUP 1+ ROLL DROP 
>R 1- R> 

Toolbelt #8 code continues on page 49. 



. widl n) ( R: wid) 

. widl n i) 

. widl n i widi) 

. widl n i) 
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SWOOP: 

Object-Oriented Programming in SwiftForth 


Wil Baden kindly introduced my object implementation 
in the preceding issue of Forth Dimensions. Here I will attempt 
to present the details of its operation. 

1. Origins and motivations 

Prior to embarking on this project, I evaluated several Forth 
OOP implementations: Baden[l], Ertl[4], McKewan[5], and 
Fountain [6]. None entirely met my requirements. 

The first consideration I faced was the order of the object/ 
message tuples. The two fundamental flavors of this syntax 
are message-object and object-message. Both have existing imple- 
mentations, pros and cons, supporters and detractors. I de- 
cided on object-message because it more closely paralleled the 
Forth piogiamming paradigm. It also has the benefit, in nested 
object definitions, of progressing from the general to the spe- 
cific, or from the collection of data to the individual datum. 

My second consideration was whether to have the compo- 
nents of a class parse or not. In most of the object-oriented 
Forths I studied, each entity parses its successor and determines 
what the phrase means. Ertl objected strongly to this as limit- 
ing the usefulness and extensibility of the messaging model — 
making it difficult to pass messages on the Forth stack — and as 
imposing an artificial dependency on the adjacency of oper- 
ands. I agree with this analysis, and developed a syntax almost 
completely independent of parsing requirements. 

The third consideration was that the class model had to 
provide for encapsulation and information hiding. This is ap- 
parently an absolute requirement if an object model is to be 
taken seriously. Some existing systems provide this, others 
do not. 

All these features were implemented to one degree or an- 
other in the various systems 1 evaluated. But none addressed 
my fourth consideration: the need for the generated code to 
be target-compilable. This reduces to the need for the com- 
pile and interpret behaviors and structures to be fully sepa- 
rate from, and independent of, the run-time code. 

2. Basic SWOOP Components 

2. /. Defining a simple class 

POINT (defined below) is a simple class 1 have been using 
as my primary building-block example for SWOOP. It dem- 
onstrates two of the four basic class member types: data and 
colon. 

The word following CLASS is the name of the class; all defi- 
nitions between class and END-CLASS belong to it. These 
definitions are referred to as members of the class. When a class 
name is executed, it leaves its handle (xt) on the stack. The 
constructor words are the primary consumers of this handle. 


CLASS POINT 

VARIABLE X 
VARIABLE y 

: SHOW ( — ) X @ . Y @ . ; 
: DOT ( -- ) ." Point at " SHOW ; 
END-CLASS 

The class definition itself does not allocate any instance 
storage; it only records how much storage is required for each 
instance of the class. VARIABLE reserves a cell of space and 
associates it with a member name. 

The colon members show and dot are exactly like normal 
Forth colon definitions, except they are only valid in the ex- 
ecution context of an object of tj^e point, x and y also be- 
have exactly like normal Forth variables except for being 
valid only within the scope of a point object. 

There are four kinds of members: 

1. Data members, including all data definitions. Available 
data member defining words include CREATE (normally 
followed by data compiled with , or c , ), buffer : (an 
array whose length is specified in address units), VARI- 
ABLE, CVARIABLE (single char), or constants; 

2. Colon members, normal colon definitions that may act 
on or use data members; 

3. Deferred members, colon-like definitions with a default 
behavior that can be referenced while defining the class, 
but may have substitute behaviors defined by sub-classes 
defined later; 

4. Other objects. 

The deferred members allow for polymorphism and late 
binding, and will be discussed later. 

2.2. Static instances of a class 

Having defined a class, we can create an instance of it. 
BUILDS is the static instance constructor in SWOOP; it is a 
Forth defining word and requires the handle of a class on the 
stack when executed. 
POINT BUILDS ORIGIN 

This defines a static object of class point named origin. 
^ Now, any of the members of PC iNT may be referenced in the 
context of this object. For example: 

5 ORIGIN X ! 
8 ORIGIN Y ! 
ORIGIN DOT 

When the name of an object is executed, two things hap- 
pen: first, the Forth interpreter's context is modified to in- 
clude the namespace of the class that created it. Second, the 
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Some OOP Terminology 



class A generalized specification for objects that will share 
common data structures and methods. 

deferred member In SSWOOP, a method that is subject to 
late binding. In C++, this is referred to as a virtual method. 

early binding Resolving references to functions statically 
at compile time (the normal behavior of compilers) . This gives 
the best performance, but is less flexible than late binding, 

encapsulation Combining data and methods into a single 
package that responds to messages. 

information hiding The ability of an object to possess data 
and methods that are not accessible outside its class. 

inheritance The ability to define a new class based on a 
previously defined ("parent") class, and to have the new class 
automatically possess all members of the parent. It may add 
to or replace these members, or define actual behaviors for 
deferred members. 

instance An object constructed according to a class specifi- 
cation. An instance is to its class as a building is to its blue- 
prints. 

instance data The data structures within an instance. 

late binding The ability of an object to resolve references 
to functions dynamically at run time, based on the message 
sent to it. This is extremely flexible, but is inevitably slower 
than early binding. 

member In C++, a data field in an object; in SWOOP, mem- 
bers include both data fields and methods. 

message Data passed to an object for the purpose of re- 
questing it to execute one of its methods. 

method A function performed by an object in response to 
a message. 

namespace In SWOOP, the names of members recognized 
in a particular object class, including its locally defined mem- 
bers in addition to those inherited from parent classes. 

object A packaged combination of data and methods. ^ 

object-oriented programming (OOP) A programming sys- 
tem that features encapsulation, information hiding, poly- 
morphism, and late binding. 

polymorphism The ability of different sub-classes of a class 
to respond to the same message in different ways. For ex- 
ample, all vehicles can steer, but bicycles do it differently from 
automobiles. 



address of the object is placed on the stack. The phrase 

ORIGIN 2 CELLS DUMP 

is perfectly valid (assuming you have a suitable DUMP func- 
tion). Each of the members of the class act on this address. 
Members that represent data simply add an offset to it; mem- 
bers that are defer or colon definitions push the address into 
' SELF (which holds the current object address) before ex- 
ecuting, and restore it afterwards. 

2.3. Dynamic objects 

We can also create a temporary context in which to refer- 
ence the members of a class. USING parses the word follow- 
ing it and, assuming that it is the name of a class, makes its 
members available for use on data at a specified address. For 
instance, I can place data at pad and use the members of 
POINT to act on it: 
6 PAD ! 9 PAD CELL+ ! 
PAD USING POINT DOT 

This will print 6 and 9. It is a very simple, user -managed 
dynamic instance of a class. It is also, generally, not a good 
way to use dynamic objects. 

A better idea is to let SWOOP manage dynamic instances 
for you. NEW is the dynamic constructor. It is not a defining 
word, but is a memory management word similar to ALLO- 
CATE. It requires a class handle on the stack, and returns an 
address. When the object is no longer needed, it can be dis- 
posed of with DESTROY. 
VALUE FOO 
POINT NEW TO FOO 
8 FOO USING POINT X ! 
99 FOO USING POINT Y ! 
FOO USING POINT DOT 
FOO DESTROY TO FOO 

This example uses FOO to keep up with the address of an 
instance of POINT. After the instance is created, it may be 
manipulated (with a slight change in syntax) in the same 
way a static instance of point is. When it's no longer needed, 
the instance is destroyed and the address kept in FOO is in- 
validated. 

Objects created by new do not exist in the Forth dictio- 
nary, and must be explicitly destroyed when no longer used. 

Another form of dynamic object instantiation is local ob- 
jects. These, hke local variables, are available only inside a 
single colon definition, and are instantiated only while the 
definition is being executed. Here's an example: 
: TEST ( -- ) 

[ OBJECTS POINT MAKES JOE OBJECTS] 
JOE DOT ; 

You can define as many local objects as you need between 
[ OBJECTS and OB JECTS] . They will all be instantiated when 
TEST is executed, and destroyed when it is completed. This 
is a particularly useful facility in Windows programming, 
because these objects can be used in Windows callback rou- 
tines. Unfortunately, local objects cannot be implemented 
straightforwardly in ANS Forth, as that depends heavily on 
internal SwiftForth implementation features, so they are not 
included in the released code. 
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2.4. Embedded objects 

Pre-defined classes may be used as members of other 
classes. The syntax for using one is the same as for defining 
static objects. These objects are not static; they will be con- 
structed only when their container is instantiated. 
CLASS RECTANGLE 

POINT BUILDS UL 
POINT BUILDS LR 

: SHOW ( -- ) UL DOT LR DOT ; 
: DOT ( — ) ." Rectangle, " SHOW ; 
END-CLASS 

In this example, the points giving the upper-left and lower- 
right corners of the rectangle are instantiated as POINT ob- 
jects. The members of rectangle may reference them by 
name, and may use any of the members of POINT to manipu- 
late them. In this example, show references the dot member 
of POINT to print UL and LR; this member is «of the same as 
the DOT member of RECTANGLE. 

These embedded objects are exactly like data allocations 
in the class: they simply add an offset to the base address of 
the object's data when referenced. There is nothing special 
about creating an instance of such a class, but the created 
object has all public members of the embedded objects avail- 
able as well. 

2.5. Information hiding 

Classes are composed of named members. Thus far, all the 
members have been visible in any reference to the class or an 
object of the class. Even though the member names are hid- 
den from casual reference by the user, the information-hid- 
ing requirements of object-oriented programming are more 
stringent. 

The accepted level of information hiding in OOP seems 
to be that classes must have at least the ability to hide mem- 
bers from any external access. SWOOP accomplishes this via 
three keywords: 

• PUBLIC identifies members that can be accessed globally. 

• PROTECTED identifies members that are available only 
within the class in which they are defined, and in its sub- 
classes. 

• PRIVATE identifies members that are available only 
within the defining class. 

When a class definition is begun, all member names de- 
fault to being public, which is to say visible outside of the 
class definition. PRIVATE or protected changes the level of 
visibility of the members. 
CLASS POINT 
PRIVATE 

VARIABLE X 

VARIABLE Y 

: SHOW ( -- ) X 8 . Y @ . ; 

PUBLIC 

: GET ( -- X y ) X @ Y @ ; 
: PUT ( X y — ) Y ! X ! ; 
: DOT ( — ) ." Point at " SHOW 
END-CLASS 



In this definition of point the members X, Y, and SHOW are 
now private, available to local use while defining POINT and 
hidden from view afterwards. Since a point is relatively useless 
unless its location can be set and read, members which can do 
this are provided in the public section. However, these defini- 
tions achieve the desired level of information hiding: the ac- 
tual data storage is unavailable to the user and may only be 
accessed through the members provided for that purpose. 

2.6. Inheritance and polymorphism 

Inheritance is the ability to define a new class based on an 
existing class. The new sub-class, which initially has exactly 
the same members as its parent, can replace some of the in- 
herited members or can add new ones. If the subclass rede- 
fines an existing member, all further use within the subclass 
will reference the new one; all prior references were already 
bound and continue to reference the previous member. 

Polymorphism goes a step further than inheritance. In it, a 
new subclass inherits all the members of its parents, but may 
also redefine any DEFER: members of its parents. 

For example, our previous example could be written this 
way: 

CLASS POINT 
VARIABLE X 
VARIABLE Y 

DEFER: SHOW ( -- ) X @ . Y @ . ; 
: DOT ( -- ) ." Point at " SHOW ; 
END-CLASS 

Then you could make a sub-class like this: 
POINT SUBCLASS LABEL-POINT 
: SHOW ( -- ) 

." X" X @ . ." Y" Y @ . ; 

END-CLASS 

LABEL-POINT BUILDS POO 
POO DOT 

The original definition dot in the parent class POINT will 
still reference SHOW, but when it is executed for an instance of 
LABEL- POINT, the new behavior will automatically be substi- 
tuted, so POO DOT will print the labeled coordinates. 

3. Data Structures 

This section will describe the basic data structures involved 
in classes and members, as a foundation for discussing more- 
detailed implementation strategies underlying SWOOP. • 



Figure One. Structure of a class 
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3.1. Classes 

The data representation of a class is shown in Figure One. 
Each class is composed of a eight-cell structure. All classes are 
linked in a single list that originates in the list head CLASSES. 
This allows the system or user to see all created classes, and 
will be used in the future to facilitate the implementation of 
a class browser. 

Each class has a unique handle. When executed, a class 
name will return this handle. The handle also happens to be 
the that is returned by ticking the class name. For example, 
if POINT is a class, then 
' POINT . 

prints the same value as 
POINT . 

Each class (except supreme) has a superclass. By default, it 
is supreme, but a class can be a child of any pre-existing class. 
The value in the Super field is the handle (xt) of the superclass. 

Classes are composed of members, divided into three lists — 
public, protected, and private — which are identical except for 
their visibility to external references. Each list has a head in 
the class data structure. With inheritance, these lists may chain 
back into its superclass, and its superclass, etc., all the way 
back to supreme. The ordering within the chain is such that 
the head points to the last (most recently defined) member, 
which is linked to the next most recently defined, etc. This 
is the same ordering as within a Forth dictionary, and allows 
for redefinitions. These lists, in conjunction with the class 
handle and the wordlist members, define the class namespace. 

The size field represents the size (in bytes) required by a 
single instance of the class. This value is the sum of all ex- 
plicitly referenced data in the class itself plus the size of its 
superclass. 

The class tag is a simple constant used to identify the data 
structure as a valid class. 



Figure Two. Basic structure of a member 
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Figure Three. Data structures for various nnember types 
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A class definition is begun by CLASS or subclass and is 
ended by end-class. While a class is being defined, the nor- 
mal Forth interpreter/compiler is used; its behavior is modi- 
fied by changing the search order to include the class 
namespace and the wordlist CC-WORDS. 

All links in this system are relative, and all handles are 
execution tokens (xt). This is the only way I have found to 
generate a system I could guarantee to be portable across many 
different ANS Forth platforms. In the general case, this re- 
sults in data structures that are relocatable. Specifically, in 
SwiftForth, this means that the objects created in the interac- 
tive system at a given address will work when saved as a DLLs, 
which are loaded arbitrary addresses by the operating system. 

3.2. Members 

Members are defined between CLASS and end-class. They 
parallel the basic Forth constructs of variables, colon-defini- 
tions, and deferred words. The definition of a member has 
two parts. First is the member's name, which exists in the 
wordlist members. The xt of this name is used as the member 
id when it is referenced. Second is the member's data struc- 
ture. This contains information about how to compile and 
execute the member. Each member is of the general format 
shown in Figure Two; the specific format of some member 
types is shown in Figure Three. 

The data structure associated with a member has five fields: 
member compiler, link, message id, member run time, and 
data. The data field is not of fixed length; its content de- 
pends on the programmatic expectations of the compiler and 
run-time routines. 

The compiler xt is the early binding behavior for members, 
and the run-time xt is the late binding behavior. Each variety 
of member has a unique compiler xt and run-time xt; both ex- 
pect the address of the member's data field on the stack when 
executed. The message id in each entry is the xt given by the 
member's name in the MEMBERS wordlist. 

The data field contents vary depending on the type of 
member the structure represents. For data members, the data 
field contains the offset into the current object. For colon 
members, it contains the Forth xt which is executed to per- 
form the actions defined for the member. In defer members, 
the data field also contains an xt, but it is only used if the 
defer is not extended beyond its default behavior. The data 
field of colon members contains the actual Forth xt to be 
compiled when the method is referenced. In object mem- 
bers, the data field contains both the offset in the current 
object of the member and the class handle of the member. 

4. Implementation Strategies 

Having discussed the basic syntax and data structures in- 
volved in SWOOP, we can now consider the underlying 
mechanisms in the system. 

4. 1. Global state information 

SWOOP depends on two variables for its behavior during 
compilation and execution. 'THIS contains the handle of 
the active class, and ' SELF has the active object's data ad- 
dress. The system provides words to set, save, and restore these 
variables. See the section on system variables in Listing One. 
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In SwiftForth, these are implemented as user variables so that 
object code is re-entrant. 

SWOOP maintains two wordlists associated with the com- 
pilation of classes and objects, members contains the list of 
unique identifiers used to name the members of classes, and 
CC-WORDS contains the compiler words used to construct the 
definitions of the members of classes. 

4.2. Classes and member identifiers 

In other OOP implementations, classes are composed of 
instance data, methods that can act on the data, and mes- 
sages corresponding to these methods that can be sent to 
objects derived from the class. 

In SWOOP, instance data and methods are combined into 
a single orthogonal concept: members. Each member has a 
unique identifier which can be used as a message. Members 
exist as created names in the members wordlist; each member's 
xt is its identifier. A given name will exist only once in mem- 
bers; a member name always corresponds to the same iden- 
tifier (i.e., xf), regardless of the class or context in which it is 
referenced. 

Classes are composed of members organized in the public, 
protected, and private lists. The structure of a class is shown in 
Figure One. The member lists of a class are based on switches 
(VanNorman [7]) and use a member identifier as a key. A class 
doesn't know the names of its members, only their identifiers. 

4.3. Compilation strategy 

The two common models of object systems in Forth seem 
to be mutually exclusive: one parses and has encapsulation, 
the other doesn't parse but lacks information hiding. 

The main strengths of the parsing model are encapsulation 
and information hiding. This is achieved by each word being 
immediate — it always executes, and it parses the next word 
instead of allowing the Forth interpreter to do so. This is how 
the context for the next word is enforced; it contains an im- 
plied search order change at each token of a multi-word phrase. 
An unpublished implementation by Charles Melice achieves 
information hiding via wordlists; each word parses and ex- 
plicitly searches for its successor in a class-unique wordlist. 

The main strength of the non-parsing model is its gener- 
ality. Code simply pushes object addresses on the stack, modi- 
fies them, then eventually acts on these addresses. Each to- 
ken is standalone, not knowing or caring what produced its 
input or what consumed its output. All names exist in the 
primary system wordlist. 

The epiphany was my realization that the strengths of these 
models did not contradict each other. The SWOOP model is a 
synthesis of these two strengths. The result of this interplay of 
ideas is the namespace. A class's namespace is defined by all 
words in the MEMBERS wordlist whose handles match keys in * 
the class's public, protected, or private member lists. 

The executable definitions associated with entries in MEM- 
BERS are immediate. When members is part of the search or- 
der, a reference to a member may be found there, and it will 
be executed. When it is executed, it will search for a match 
on its handle in the list of keys in the member lists for the 
current class (identified by ' THIS). If a match is found, the 
compilation or execution xt associated with the matching 
member will be executed, depending on STATE. If there is no 
match in the current class, the name will be re-asserted in 
the input stream and the Forth interpreter will be invoked to 



search for it in other wordlists, handling it subsequently in 
normal fashion. 

4.4. Compilation of classes and objects 

One of my goals for SWOOP was to make the definition 
of classes and, in particular, the members of a class, map onto 
the common Forth paradigm, which meant being able to tem- 
porarily supercede the meaning of the Forth defining words. 
I achieved this by having a wordlist called cc-WORDS that 
contains all of the member-defining words, and which is only 
present in the search order while compiling a class. 

The simplest way to discuss the compiler is to walk through 
its operation as a class is built. So, we define a simple class: 
CLASS POINT 

VARIABLE X 
VARIABLE Y 

: DOT ( -- ) X @ . Y . ; 
END-CLASS 

The phrase CLASS POINT creates a class data structure 
named POINT, links it into the CLASSES list, adds CC-WORDS 
and members to the search order, and sets ' THIS and cstate 
to the handle of point. The variable cstate contains the 
handle of the current class being defined, and remains non- 
zero until END-CLASS is encountered. This is used by the vari- 
ous member compilers to decide what member references 
mean, and how to compile them. 

VARIABLE X (and, likewise, Y) executes the class-defining 
word VARIABLE in CC-WORDS, which adds a member name to 
MEMBERS and to the chain of public members for point. 

Although the colon definition DOT looks like a normal 
Forth definition, its critical components : and ; are highly 
specialized words in the cc-WORDS wordlist. This version of 
: searches for the name dot in the members wordlist; if there 
is one already, it uses its handle as the message ID for the 
member being defined. Otherwise, it constructs a name in 
members (rather than with the class definitions being built), 
keeping its handle. Then it begins a : NONAME definition, which 
is terminated by the ; . This version of ; not only completes 
the definition, it uses its xt along with the message ID to 
construct the entry in the appropriate chain for dot. 

When a class member is referenced (such as in the refer- 
ence to X in DOT), its compiler method is executed. This rou- 
tine (such as COLON-METHOD and data-method) compiles a 
reference to the member. 

4.5.Self 

Notice that, seemingly, we have inconsistent use of our 
members. While defining PO INT, we simply reference x; while 
not defining POINT, we must reference an object prior to X. 
This problem is resolved in some systems by requiring SELF 
to appear as an object proxy during the definition of the class. 
: DOT ( — ) SELF X @ . SELF Y @ . ; 

This results in a more consistent syntax, but is wordy and 
repetitive. However, to the compiler, the reference to X is not 
ambiguous, so the explicit reference to self is unnecessary. 
While a class is being defined, SWOOP notices that X (or any 
other member) is indeed a reference to a member of the class 
being defined and automatically inserts SELF before the ref- 
erence is compiled. This results in a simpler presentation of 
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the routine, and makes the code inside a class look like it 
would if it were not part of a class definition at all. 

4.6. Binding 

The way a member is referenced may be decided at com- 
pile time or at run time. 

If the decision is made at compile time, it is known as 
early binding and assumes that a specific, known member is 
being referenced. This provides for simple compilation and 
the best performance when executed. 

If the decision is made at run time, it is known as late 
binding, which assumes that the member to be referenced is 
not known at compile time and must, therefore, be 
looked up at run time. This is slower than early bind- 
ing because of the run-time lookup, but it is more 
general. Because of its interactive nature, this behav- 
ior parallels the use of the Forth interpreter to refer- 
ence members. 

SWOOP is primarily an early binding system, but 
late binding is available through two mechanisms. 
The first is deferred members, a technique that paral- 
lels the Forth concept of a deferred word. This imple- 
ments the facet of late binding where the member 
name to be referenced is known, but the behavior is not yet 
determined when the reference is made. The second is the 
word SENDMSG, which sends an arbitrary message ID to an 
arbitrary object. This strategy makes it possible to, for example, 
send Windows message constants to a window object for pro- 
cessing. 

5. Optimization 

Version 2.0 of SwiftForth (currently in beta release) will 
include both SWOOP and a powerful rule-based optimizing 
compiler. Many of its optimization strategies provide signifi- 
cant improvement on both the size and performance of code 
generated by SWOOP. For example, the sequence: 
CLASS POINT 

VARIABLE X 

VARIABLE Y 

: DOT X ? Y ? ; 
END-CLASS 

CLASS RECT 

POINT BUILDS UL 

POINT BUILDS LR 
END-CLASS 

CLASS CUBE 

RECT BUILDS TOP 

RECT BUILDS BOT 
END-CLASS 

CUBE BUILDS FOO 



Figure Four. Member data structure, 
showing embedded switch 




Switch structure 

A 



Message 
id 



Run- time 
xt 



Data. 



Figure Five 
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...generates the code shown in Figure Five for TESTl, less 
than one machine instruction per Forth word. 



6. Future enhancements 

As noted, SWOOP was designed from the outset to be 
amenable to cross- or target-compiling. This is most obvi- 
ously manifest in the separation of compile-time and run- 
time behaviors for members associated with a class. In a non- 
extensible, ROMable target, the compiler portion of the mem- 
ber data structure would reside in the host during compila- 
tion and interactive testing, and only the run-time support 
(shown in Figure Four) would reside in the target. 

Note that the design of the member data structure incor- 
porates a "switch," as described in my previous article [7]. 
These can be implemented extremely efficiently. Early-bound 
members will simply execute their xts; late-bound members 
will call the run-time switch. 

7. Source code 

The source code is broken into two basic parts: the pre- 
amble PRESWOOP, which Wil Baden presents elsewhere in this 
issue of Forth Dimensions, and the source code for SWOOP itself 
in Listing One. Listing Two provides some simple extensions 
to the object model, showing how to add new data types, etc. 
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Listing One 



^ ==================================================================== 

(C) Copyright 1999 FORTH, Inc. www.forth.com 

FORTH, Inc. grants to members of the Forth Interest Group permission to use this code 
providing the user clearly acknowledges FORTH, Inc. as author. FORTH, Inc. assumes no 
responsibility for the accuracy or completeness of this code. We will greatly appre- 
ciate being notified of any improvements users may make or recommend. 
==================================================================== J 

{ _ 

The following set of words have the most promise of performance 
improvement if optimized with machine code. These inefficient versions 
should be commented out if other versions already exist. 

Classes return their xt when executed. A class's xt is considered 
to be its handle. All class operations are based on this handle. 

'THIS has the handle of the current class and 
'SELF has the address of the current object. 

THIS returns the handle of the current class and 

SELF returns the address of the current data object, normally used 
only while defining a class. 

>THIS writes a new value into 'THIS and 
>SELF writes a new value into 'SELF. 

>C C> >S S> are compiler macros which preserve the values of 
'THIS and 'SELF respectively. They are used in pairs around 
code sequences. 

>C C> save, set, and restore 'THIS. "THIS >R >THIS . . . R> >THIS" 
>S S> save, set, and restore 'SELF. "SELF >R >SELF . . . R> >SELF" 

>DATA returns a data address for the xt of an object 

} 

VARIABLE 'THIS 
VARIABLE 'SELF 

: THIS ( -- class ) 'THIS @ ; 
: SELF ( -- object ) 'SELF @ ; 

: >THIS ( class -- ) 'THIS ! ; 
: >SELF ( object — ) 'SELF ! ; 

: >C ( class — ) 

POSTPONE THIS POSTPONE >R POSTPONE >THIS ; IMMEDIATE 

: C> ( — ) 

POSTPONE R> POSTPONE >THIS ; IMMEDIATE' 

: >S ( object — ) 

POSTPONE SELF POSTPONE >R POSTPONE >SELF ; IMMEDIATE 

: S> ( — ) 

POSTPONE R> POSTPONE >SELF ; IMMEDIATE 

: >DATA ( xt -- object ) >BODY 3 CELLS + ; 
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{ 

CSTATE has the class handle while we are defining a class. 



" SELF' is a compiler tool to emplace a reference to SELF before 
each class-local item while compiling the class. This makes the 
code look nicer; instead of SELF X @ one can just say X @ . 
Pronounce this by wiggling two fingers on each hand in the air 
while saying the word SELF. 

"THIS" emplaces a reference to the current class as necessary for 
resolving defer methods or simply executing a class member. 



VARIABLE CSTATE 

: "SELF' ( — ) 

CSTATE @ -EXIT CSTATE @ THIS <> ?EXIT POSTPONE SELF ; 

: "THIS" ( -- ) CSTATE @ IF 

CSTATE @ THIS = IF POSTPONE THIS EXIT THEN 
THEN THIS POSTPONE LITERAL ; 

{ 

We manage our object system with two system wordlists. 

CC-WORDS has the defining words used while building classes and 
MEMBERS has the unique identifiers for class members. 

+MEMBERS adds the MEMBERS wordlist to the search order and 
-MEMBERS removes it from the search order. 

+CC puts MEMBERS and CC-WORDS on the top of the search order and 
-CC removes them from the search order. 



WORDLIST: CC-WORDS 
WORDLIST: MEMBERS 

: +MEMBERS ( — ) MEMBERS +ORDER ; 
: -MEMBERS ( -- ) MEMBERS -ORDER ; 

: +CC ( -- ) +MEMBERS CC-WORDS +ORDER ; 
: -CC ( -- ) -MEMBERS CC-WORDS -ORDER ; 

{ 

Classes are: 

I link I xt I super | public | protected | private I size | tag | 
>SUPER etc traverse this structure from the^ class handle. 
SIZEOF returns the size of the specif ied 'class . 
I CLASS I is how many cells are required to define a class. 
CLASSTAG is a marker derrived from the xt of | CLASS]. 



: BODY+ ( n "name" -- n+1 ) 

CREATE DUP CELLS , 1+ DOES> @ SWAP >BODY + ; 



} 



} 
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BODY+ >CLINK 

BODY+ >CHANDLE 

BODY+ > SUPER 

BODY+ >PUBLIC 

BODY+ > PROTECTED 

BODY+ > PRIVATE 

BODY+ >SIZE 

BODY+ >CLASSTAG 
CONSTANT I CLASS I 

' I CLASS I CONSTANT CLASSTAG 
' I CLASS I 1+ CONSTANT OBJTAG 

: SIZEOF ( class -- n ) >SIZE @ ; 

{ 

Executing a named class returns its xt, which is its handle. 

When a class is created, THIS will contain the handle of the class 
until END-CLASS is executed. 

CLASSES has the list of all known classes. 

OPAQUE has if new members are PUBLIC, 1 if new members are PROTECTED, 
and 2 if new members are PRIVATE. This is an offset, translated into 
cells from >PUBLIC when used in NEW-MEMBER. 

CLASS defines a new class. With 
SUBCLASS, we use 

INHERITS to build a new class from an existing one. 
RE -OPEN allows further refinements of a class. 

SUPREME is the mother of all classes. Members may be added to 
it with extreme care. 



CHAIN CLASSES 
VARIABLE OPAQUE 

: RE-OPEN ( class — ) DUP >THIS CSTATE ! OPAQUE ! +CC ; 

: (CLASS) ( -- ) CREATE-XT ( xt) DUP RE -OPEN 

CLASSES >LINK ( xt) , | CLASS I 2 - CELLS /ALLOT CLASSTAG , 
DOES> CELL+ @ ; 

(CLASS) SUPREME -MEMBERS -CC 

: INHERITS ( class -- ) 

HERE CELL- @ CLASSTAG <> ABORT" INHERITS must follow CLASS <name>" 

ICLASSI 1- CELLS NEGATE ALLOT \ forge't all except link. 

DUP , \ point superclass field to new parent, 

DUP >PUBLIC RELINK, \ inherit public 

DUP > PROTECTED RELINK, \ and protected. 

, \ never inherit private. 

DUP SIZEOF , \ inherit size. 

CLASSTAG , \ mark this as a class. 

DROP ; 

: CLASS ( -- ) 

(CLASS) SUPREME INHERITS ; 
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: SUBCLASS ( class — ) 
(CLASS) INHERITS ; 

{ 

COMPILE-AN-OBJECT compiles a reference that returns the object's 

address generated by the given xt and adds MEMBERS to the search order. 

INTERPRET-AN-OBJECT returns an object's address. 

(OBJECT) compiles or executes an object reference. 

BUILDS creates a named object which looks like: 
I xt I class I data.... I 

USING sets the class search order so that the MEMBERS wordlist is active. 

The net result is to allow the use of arbitrary class methods on an 

arbitrary address in memory. 
} 

: COMPILE-AN-OBJECT ( addr xt -- ) >R 

@+ POSTPONE LITERAL R> COMPILE, CELL+ @ >THIS +MEMBERS ; 

: INTERPRET-AN-OBJECT ( addr xt -- addr ) >R 
@+ SWAP CELL+ @ >THIS +MEMBERS R> EXECUTE ; 

: (OBJECT) ( addr xt — | addr ) 

STATE @ IF COMPILE-AN-OBJECT ELSE INTERPRET-AN-OBJECT THEN ; 

: BUILDS ( class — ) 

CREATE-XT IMMEDIATE ( xt) , OBJTAG , ( class) DUP , SIZEOF /ALLOT 
DOES> [ '] >DATA (OBJECT) ; 

: USING ( -- ) ' DUP >CLASSTAG @ 

CLASSTAG <> ABORT" Class name must follow USING" 
>THIS +MEMBERS ; IMMEDIATE 

{ 

NEW is the dynamic object constructor and 
DESTROY is the corresponding destructor. 

J 

: NEW ( class — addr ) 

DUP SIZEOF CELL+ CELL+ ALLOCATE THROW OBJTAG !+ SWAP !+ ; 

: DESTROY ( addr — ) 

CELL- CELL- FREE THROW ; 

{ 

A class has three member lists associated with it: public, protected, and 
private These lists indicate which message's the class recognizes and how 
to compile and/or execute the member when referenced. The format of these 
lists is 

I compiler-xt | link | member handle I runtime-xt | data | ... 

The data field varies from method to method. This is documented 
below in the METHODS section. 

The structure of the member list contains an embedded switch statement; 
the link I member I xt pattern. 
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A member handle represents a valid member if it is in the MEMBERS 
wordlist and either the public, protected, or private member list of the 
current class. This represents the namespace of the class. 

NEW-MEMBER builds a list entry for the current class associating the 
.member with compiler and runtime xts and a single data value. 

BELONGS? returns the address of link if the member belongs to the 

current class. BELONGS? should be coded for speed, as it is in the 
critical path for virtual methods. 

PUBLIC? searches the public list, 
PROTECTED? searches the protected list, and 
PRIVATE? searches the private list of THIS . 

CLASS-MEMBER? checks THIS class for the member. Used by RESOLVED, for 
virtual members (DEFER:) and so doesn't check PRIVATE. 

VISIBLE-MEMBER? checks the member lists of THIS class for the member. 
Since this is the action of all members, it must function both 
during class compilaion and during method reference in normal 
compilation . 

If THIS is zero, it fails; no class is current to search. 

If CSTATE is non-zero, we are compiling a class. 

If CSTATE=THIS, the reference is to the current class; search 

public, protected, and private. 
If CSTATEOTHIS, the reference is to another class; search 

public and protected, but not private. 

MEMBER? checks the specified class for the member id on the stack. 
} 

: NEW-MEMBER ( member data runtime-xt compiler-xt -- ) 
ALIGN 

, THIS >PUBLIC OPAQUE @ CELLS + >LINK ROT , , , ; 

: BELONGS? ( member list -- 'member true | member false ) 
BEGIN 

DUP @ DUP WHILE + 

2DUP CELL+ @ = 
UNTIL NIP TRUE EXIT 
THEN NIP ; 

: PUBLIC? ( member -- 'member true | member ) 
THIS > PUBLIC BELONGS? ; 

: PROTECTED? ( member — 'member true I member ) 
THIS > PROTECTED BELONGS? ; 

: PRIVATE? ( member -- 'member true | member ) 
THIS > PRIVATE BELONGS? ; 

: CLASS-MEMBER? ( member — 'member true | ) 
THIS IF 

PUBLIC? DUP ?EXIT DROP 

PROTECTED? DUP ?EXIT DROP 
THEN DROP ; 
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: VISIBLE-MEMBER? ( member — 'member 
THIS IF 

PUBLIC? DUP ?EXIT DROP 
CSTATE @ IF 

PROTECTED? DUP ? EXIT DROP 
CSTATE @ THIS = IF 

PRIVATE? DUP ?EXIT DROP 
THEN 
THEN 
THEN DROP ; 

: MEMBER? ( member class -- 'member t 
> PUBLIC BELONGS? ; 



true I ) 

\ class is selected 
\ exit if in public 
\ compiling a class 
\ exit if in protected 
\ compiling this class 
\ exit if in private 
\ 

\ else normal forth reference 
\ failing 

■ue I member ) 



EARLY-BINDING executes the compiler-xt of the given member, which 
compiles a reference to it according to the member type. 

LATE-BINDING executes the runtime-xt of the given member. All 
members require an object address on the stack when executing. 
This is used for runtime binding (i.e., true late binding) and 
for Forth interpreter access . 

REFERENCE-MEMBER either compiles or executes a member. 

70BJECT throws if the entity whose address is on the stack is not 
an object. 

SENDMSG executes the given member id in the context of the class to 
which the object belongs. This is considered to be sending a 
message . 

RESOLVED looks up the member in the current class and executes it. 
This is used at runtime for late binding of virtual functions . 
We search from the class pointed to by THIS at runtime, and the 
first member match we find is executed. If no better behavior is 
defined than the initial DEFER:, we will find that and execute 
it by default. 

} 



: EARLY-BINDING ( 'member — ) 

DUP 3 CELLS + SWAP CELL - @ EXECUTE ; 

: LATE-BINDING ( object 'member — ) 

OVER CELL- @ >THIS 2 CELLS + @+ EXECUTE ; 



: REFERENCE-MEMBER { [ object] 'member -- ) 
STATE @ IF EARLY-BINDING ELSE 

CSTATE @ IF ( interpreting in a class definition) 

SWAP 2 CELLS + @+ EXECUTE 
ELSE 

LATE-BINDING THIS 0= IF -MEMBERS THEN 
THEN 
THEN ; 



: ? OBJECT ( object -- ) 

2 CELLS - @ OBJTAG <> THROW ; 

: RESOLVED ( member -- ) 

CLASS-MEMBER? 0= THROW 3 CELLS + @ EXECUTE ; 
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Introduction to FIG, threaded code, TO variables, fig-Forth. 

Volume 6 Forth Dimensions (1 984-85) 1 06 - $85 
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Forth, optimizing compiler, dynamic memory allocation, 
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Volume 1 3 Forth Dimensions (1 991 -92) 1 1 3 - $45 

Volume 1 4 Forth Dimensions (1 992-93) 1 1 4 - $45 

Volume 1 5 Forth Dimensions (1 993-94) 1 1 5 - $45 
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FORML CONFERENCE PROCEEDINGS 



The annual FORML Conference is an educational forum for sharing and 
discussing new or unproven proposals intended to benefit Forth, and is 
fordiscussion of technical aspects of applications in Forth. Proceedings 
are a compilation of the papers and abstracts. FORML is an activity of 
the Forth Interest Group. 

1981 FORML PROCEEDINGS 311 - $70 

CODEIess Forth machine, quadruple-precision arithmetic, 
overlays, executable vocabulary stack, data typing in Forth, 
vectored data structures, using Forth in a classroom, pyramid 
files, Basic, LOGO, automatic cueing language for multimedia, 
NEXOS-a ROM-based multitasking operating system. 655 
pp. 

1982 FORML PROCEEDINGS 31 2 -$65 

Rockwell Forth processor, virtual execution, 32-bit Forth, 
ONLY for vocabularies, non-IMMEDIATE looping words, 
number-input wordset, I/O vectoring, recursive data 
structures, programmable-logic compiler. 295 pp. 

1983 FORML PROCEEDINGS 313 - $65 

Non-Von Neuman machines. Forth instruction set, Chinese 
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1984 FORML PROCEEDINGS 31 4 -$65 

Forth expert systems, consequent-reasoning inference engine, 
Zen floating point, portable graphics wordset, 32-bit Forth, 
HP71 B Forth, NEON-object-oriented programming, decom- 
piler design, arrays and stack variables. 378 pp. 

1986 FORML PROCEEDINGS 316 - $65 

Threading techniques, Prolog, VLSI Forth microprocessor, 
natural-language interface, expert system shell, inference 
engine, multiple-inheritance system, automatic programming 
environment. 323 pp. 

1988 FORML PROCEEDINGS 318 - $65 

Includes 1 988 Australian FORML. Human interfaces, simple 
robotics kernel, MODUL Forth, parallel processing, 
programmable controllers, Prolog, simulations, language 
topics, hardware, Wil's workings & Ting's philosophy. Forth 
hardware applications, ANS Forth session, future of Forth in 
Al applications. 310 pp. 

1989 FORML PROCEEDINGS 319-$65 

Includes papers from '89 euroFORML. Pascal to Forth, 
extensible optimizer for compiling, 3D measurement with 
object-oriented Forth, ORG polynomials, F-PC, Harris C 
cross-compiler, modular approach to robotic control, RTX 
recompilerfor on-line maintenance, modules, trainable neural 
nets. 433 pp. 

1 992 FORML PROCEEDINGS 322 - $45 

Object-oriented Forth based on classes rather than 
prototypes, color vision sizing processor, virtual file systems, 
transparent target development, signal-processing pattern 
classification, optimization in low-level Forth, local variables, 
embedded Forth, auto display of digital images, graphics 
package for F-PC, B-tree in Forth 200 pp. 

1 993 FORML PROCEEDINGS 323 - $45 

Includes papers from '92 euroForth and '93 euroForth 
Conferences. Forth in 32-bit protected mode, HDTV format 
converter, graphing functions, MIPS eForth, umbilical 
compilation, portable Forth engine, formal specifications of 
Forth, writing better Forth, Holon - a new way of Forth, 
FOSM - a Forth string matcher, Logo in Forth, programming 
productivity. 509 pp. 

1 994-1 995 FORML PROCEEDINGS (in one volume!) 325 - $55 
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ALL ABOUT FORTH, 3rd ed., June 1990, Glen B. Haydon 201 - $90 

Annotated glossary of most Forth words in common use, 
including Fortti-79, Forth-83, F-PC, MVP-Forth. Implementa- 
tion examples in high-level Forth and/or 8086/88 assembler. 
Useful commentary given for each entry. 504 pp. 



©FORTH IMPLEMENTATION GUIDE, C.H. Ting 



21 5 -$37 



eForth is a Forth model designed to be portable to many of 
the newer, more powerful processors available now and 
becoming available in the near future. 54 pp. (w/disk) 

Embedded Controller FORTH, 8051 , William H. Payne 21 6 - $85 

Describes the implementation of an 8051 version of Forth. 
More than half the book is composed of source listings (w/ 
disks C050) 51 1 pp. 



F83 SOURCE, Henry Laxen & Michael Perry 



21 7 -$30 



A complete listing of F83, including source and shadow 
screens. Includes introduction on getting started. 208 pp. 



F-PC USERS MANUAL (2nd ed., V3.5) 



350 - $30 



Users manual to the public-domain Forth system optimized 
for IBM PC/XT/AT computers. A fat, fast system with many 
tools. 143 pp. 

F-PC TECHNICAL REFERENCE MANUAL 351 - $45 

A must if you need to know F-PC's inner workings. 269 pp. 



THE FIRST COURSE, C.H. Ting 



223 - $37 



This tutorial exposes you to the minimum set of Forth 
instructions needed to use Forth to solve practical problems 
in the shortest possible time. " . . .This tutorial was developed 
to complement The Forth Course, which skims too fast over 
elementary Forth instructions and dives too quickly into 
advanced topics in an upper-level college microcomputer 
laboratory... A running F-PC Forth system would be very 
useful. 44 pp. 



THE FORTH COURSE, Richard E. Haskell 



225 - $37 



This set of 1 1 lessons is designed to make it easy for you to 
learn Forth. The material was developed over several years 
of teaching Forth as part of a senior/graduate course in the 
design of embedded software computer systems at Oakland 
University in Rochester, Michigan. 156 pp. (w/disk) 



FORTH NOTEBOOK, Dr. C.H. Ting 



232 - $37 



Good examples and applications — a great learning aid. 
polyFORTH is the dialect used, but some conversion advice 
IS included. Code is well documented. 286 pp. 



FORTH NOTEBOOK II, Dr. C.H. Ting 



232a - $37 



Collection of research papers on various topics, such as 
image processing, parallel processing, and miscellaneous 
applications. 237 pp. 



FORTH PROGRAMMER'S HANDBOOK, 
Edward K. Conklin and Elizabeth D. Rather 



260 - $57 



This reference book documents all ANS Forth wordsets 
(with details of more than 250 words), and describes the 
Forth virtual machine, implementation strategies, the impact 
of multitasking on program design. Forth assemblers, and 
coding style recommendations. 
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INSIDE F-83, Dr. C.H. Ting 

Invaluable for those using F-83. 226 pp. 
OBJECT-ORIENTED FORTH, Dick Pountain 



235 - $37 



242 - $50 



Implementation of data structures. Rrst book to make 
object-oriented programming available to users of even very 
small home computers. 7 18 pp. 

STARTING FORTH (2nd ed.) Umited Reprint, Leo Brodie 245a- $50 



In this edition of Starting Forth — the most popular and 
complete introduction to Forth — syntax has been expanded 
to include the Forth-83 Standard. (The original printing is 
now out of stoc/c, but we are mal<ing available a special, 
limited-edition reprint with all the original content.) 346 pp. 
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THINKING FORTH, Leo Brodie 



255 - $35 



Back by popular demand! To program intelligently, you 
must first think intelligently. The bestselling author Starting 
Forth is back, with the first guide to using Forth for 
applications. This book captures the philosophy of the 
language, showing users howto write more-readable, more- 
maintainable applications. Both beginning and experienced 
programmers will gain a better understanding and mastery 
of topics like decomposition, factoring, handling data, 
simplifying control structures, Forth style and conventions. 
To give you an idea of how these concepts can be applied, 
Thinking Forth contains revealing interviews with users and 
with Forth's creator, Charles H. Moore. Reprint of original. 
272pp. 

WRITE YOUR OWN PROGRAMMING LANGUAGE USING C++, 

Norman Smith 270 - $35 

This book is about an application language. More specifically, 
it is about how to write your own custom application 
language. The book contains the tools necessary to begin 
the process, and a complete sample language 
implementation. (Guess what language!) Includes disk with 
complete source. 108 pp. 



WRITING FCODE PROGRAMS 



252 -$60 



This manual is for designers of SBus interface cards and 
other devices that use the FCode interface language. It 
assumes familiarity with SBus card design requirements 
and Forth programming. Discusses SBus development for 
OpenBoot 1 .0 and 2.0 systems. 414 pp. 
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joining the growing number of members who choose to show 
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Ask about our special incentives for corporate and library 
members, or become an individual benefactor! 
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Benefactor - $1 25 

Standard - $45 (add $1 5 for non-U.S. delivery) 
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office@fbrth.org 
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